Triaster uses the Keyoti search system to index files and present results on a process library website. When troubleshooting Search issues, a few key files can help.
Triaster\TriasterServer2011\KeyotiSearch\
IndexDirectory\
Numerically-named index files
Indexer.txt
lock
ParserProvider.txt
Reader.txt
IndexLog.txt
When a re-index is run as a post-publish task, actions are logged in ‘IndexLog.txt’. Useful information includes:
The timestamp on the file itself will also indicate whether indexing has happened when expected. An old timestamp would suggest a problem.
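The timestamp check described above can be scripted. The following is a minimal sketch only: the function names and the 24-hour threshold are illustrative assumptions, not part of Triaster or Keyoti, and the threshold should be matched to how often publishing is expected to trigger a re-index.

```python
import time
from pathlib import Path


def hours_since_modified(path, now=None):
    """Return the number of hours since the file at `path` was last modified."""
    now = time.time() if now is None else now
    return (now - Path(path).stat().st_mtime) / 3600.0


def is_stale(path, max_age_hours=24.0, now=None):
    """Return True if the file is older than `max_age_hours` (an assumed threshold)."""
    return hours_since_modified(path, now) > max_age_hours
```

For example, `is_stale(r"...\KeyotiSearch\IndexLog.txt")` could be called with the full path to the log on the server in question; a result of `True` would suggest that indexing has not run when expected.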
Errors for specific sources aren’t unusual. Searchable locations commonly contain files that cannot be handled, and these are often identified in the ‘Indexer.txt’ and ‘Reader.txt’ files, which are described in more detail below.
The index files are a set of numerically-named files. In a complete index, the numbers are the same and only the file extensions differ. File names with different numbers suggest either that indexing is currently underway or that an indexing run has crashed. The timestamps on those files should indicate which.
If indexing has crashed, then the index files should be deleted. A corrupt index can prevent further indexing.
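The consistency check on the index file names can be sketched as follows. This assumes the index files are named as a number followed by an extension, as described above; the function names are illustrative only.

```python
from pathlib import Path


def numeric_index_stems(index_dir):
    """Return the distinct numeric names (without extension) found in index_dir."""
    return {
        p.stem
        for p in Path(index_dir).iterdir()
        if p.is_file() and p.stem.isdigit()
    }


def index_looks_complete(index_dir):
    """Return True only if every numerically-named file shares one number.

    An empty folder (no index at all) also returns False.
    """
    return len(numeric_index_stems(index_dir)) == 1
```

If `index_looks_complete` returns `False`, the file timestamps still need to be checked by hand to tell a crashed run apart from indexing that is currently in progress.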
The ‘Indexer.txt’ and ‘Reader.txt’ files contain records of files, as represented by their HTTP URLs.
Problems reading and indexing files are likely to be identified here. Often these are of no concern, for example system files such as the ‘thumbs.db’ files that Windows Explorer uses.
However, a path may contain characters that are forbidden in URLs, which prevents the file from being indexed. For example, by default IIS (the Microsoft Windows web server) does not allow ‘+’ in a URL.
A document with a ‘+’ in its file name won’t appear in Search results. To resolve this, either rename the document to remove the problem characters, or enable ‘double-escaping’ in IIS (the request filtering ‘allowDoubleEscaping’ setting) to allow their use in URLs.
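To find affected documents before they go missing from Search results, a folder of searchable documents can be scanned for ‘+’ in file or folder names. This is a rough sketch: the function name and the folder passed to it are hypothetical, and it checks only for ‘+’, although other characters may also be rejected depending on the IIS configuration.

```python
from pathlib import Path


def files_with_plus(root):
    """Return sorted relative paths of files under `root` whose path contains '+'."""
    root = Path(root)
    return sorted(
        rel
        for rel in (p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
        if "+" in rel
    )
```

Each path returned is a candidate either for renaming or for serving with double-escaping enabled.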
When indexing is running, a ‘lock’ file is written to the ‘IndexDirectory’ folder. This prevents another re-index from being initiated. When indexing completes, the file should be deleted automatically. However, if indexing crashes, it is likely to remain. Its timestamp should indicate whether it relates to indexing that is currently running.
If indexing has crashed, this file should be deleted.
The ‘ParserProvider.txt’ file identifies MIME types that are not recognised. A web server uses a MIME type to identify a file by its nature and format, and the MIME type determines how the web server serves that file.
We’ve encountered only one related issue, in which the documents were hosted on a different server. That server’s MIME type configuration for some file types did not follow the standard definitions, and files of those types weren’t being returned in Search results. This was apparent from the ‘ParserProvider.txt’ file. Correcting the MIME types on the document-hosting server resolved the issue.
When investigating Search behaviour, these files in particular can offer helpful information:
Timestamps on the files show how current the indexing is.
To remove a corrupted index and ensure subsequent indexing can run, delete the numerically-named index files and the ‘lock’ file from the ‘IndexDirectory’ folder.
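Removing the numerically-named index files and the ‘lock’ file could be scripted roughly as below. This is a sketch under assumptions, not a Triaster-supplied tool: the function names are illustrative, index files are assumed to be named as a number plus an extension, and the default is a dry run that only reports what would be deleted.

```python
from pathlib import Path


def cleanup_candidates(index_dir):
    """List the numerically-named index files and the 'lock' file, if present."""
    return sorted(
        p
        for p in Path(index_dir).iterdir()
        if p.is_file() and (p.stem.isdigit() or p.name == "lock")
    )


def remove_corrupt_index(index_dir, dry_run=True):
    """Return the names of the candidate files; delete them only if dry_run is False."""
    targets = cleanup_candidates(index_dir)
    if not dry_run:
        for p in targets:
            p.unlink()
    return [p.name for p in targets]
```

Run it with `dry_run=True` first and confirm from the timestamps that indexing is not currently in progress. The log files (‘Indexer.txt’, ‘Reader.txt’, ‘ParserProvider.txt’) are left in place.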