Tech News

Indexing maps and documents for Quick Search

Maps and documents associated with a process library site are typically indexed at the end of a site publication, so that Search results accurately reflect the newly published site. However, a site may have a large number of documents, which take a while to index and add significant time to the publication. It may be better to index only process maps when publishing and to index documents at another time, perhaps overnight. This can be achieved with some relatively simple changes.

Indexed locations

The indexed locations are defined as Data Sources in Keyoti Search, in the file:


Triaster\TriasterServer2011\KeyotiSearch\IndexDirectory\indexableSourceRecord.xml

The Data Sources related to a library site may be:



 <DataSource ID="15" type="FileSystemDocumentStore" location="C:\Triaster\TriasterServer2011\ProcessLibraries\Process Library\Sandpit\html\" query="http://MyHost/ProcessLibraries%202011/Process%20Library/Sandpit/html/" uniqueColumn="" resultURL="" ExtensionData="@_files@True@">
 <t:Categories>
 <t:Category>Process Maps</t:Category>
 </t:Categories>
 <t:Locations>
 <t:Location>process library\sandpit:desktop</t:Location>
 </t:Locations>
 </DataSource>
...
 <DataSource ID="28" type="FileSystemDocumentStore" location="C:\Triaster\Documents\" query="http://MyHost/Documents/" uniqueColumn="" resultURL="" ExtensionData="@_files@True@">
 <t:Categories>
 <t:Category>Documents</t:Category>
 </t:Categories>
 <t:Locations />
 </DataSource>


In the first Data Source, a single Location is defined, which associates this Data Source with the Process Library Sandpit Desktop site only. In the second one, there is no Location specified, which means this Data Source is associated with all library sites.
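The association rule just described can be sketched in Python. This is an illustration only, not part of Keyoti Search or Triaster Server: the `t:` prefix is not declared in the snippet above, so the namespace URI `urn:example:triaster` and the wrapping `<DataSources>` element are placeholders invented for the example, and `sites_for` and `summarize` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace: the real declaration for the "t:" prefix is not shown
# in the article, so this URI is an assumption made for the example.
NS = {"t": "urn:example:triaster"}

# Trimmed copy of the two Data Sources, wrapped in a made-up root element.
SAMPLE = """<DataSources xmlns:t="urn:example:triaster">
  <DataSource ID="15" type="FileSystemDocumentStore">
    <t:Categories><t:Category>Process Maps</t:Category></t:Categories>
    <t:Locations><t:Location>process library\\sandpit:desktop</t:Location></t:Locations>
  </DataSource>
  <DataSource ID="28" type="FileSystemDocumentStore">
    <t:Categories><t:Category>Documents</t:Category></t:Categories>
    <t:Locations />
  </DataSource>
</DataSources>"""


def sites_for(data_source):
    """Return the listed Locations, or None when no Location is specified."""
    locations = [
        loc.text.strip()
        for loc in data_source.findall("t:Locations/t:Location", NS)
        if loc.text and loc.text.strip()
    ]
    return locations or None


def summarize(xml_text):
    """Map each Data Source ID to its Locations (None means all library sites)."""
    root = ET.fromstring(xml_text)
    return {ds.get("ID"): sites_for(ds) for ds in root.findall("DataSource")}


summary = summarize(SAMPLE)
for ds_id, sites in summary.items():
    print(ds_id, "->", sites if sites else "all library sites")
```

A check like this may be handy before changing the indexing scripts, to confirm which Data Sources are tied to a single site and which are shared by every site.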

Note the Categories, which will be mentioned later. By specifying a Category, results from a Data Source will be displayed on a Category-named tab in the Search results. (If other than the default ‘Process Maps’ or ‘Documents’, a Category tab will also need to be configured for a site in the Settings file.)

Indexing

Indexing is typically run from executables in the KeyotiSearch folder.


Triaster\TriasterServer2011\KeyotiSearch\
KeyotiReindex.exe
PostPublishReindex.cmd
Reindex All.cmd
Reindex Documents.cmd

KeyotiReindex.exe

If run directly, this would index all Data Sources.

Indexing can be more specific if run from CMD files that pass suitable parameters to KeyotiReindex.exe. Some such files are installed with Triaster Server, while others have been created separately.

PostPublishReindex.cmd

This is commonly run as a post-publish task, defined for each library site in the Settings file. By default, the indexing command is:


KeyotiReindex.exe /l:"%Library%" /s:"%Stage%" >"%LogFile%"
/l - library name
/s - stage (or site) name

When run as a post-publish task, the library and site are passed as arguments to this script. That command will index all Data Sources related to that library site.

Indexing could be filtered further by specifying Categories, either for inclusion or exclusion.


e.g. to index only process maps:
KeyotiReindex.exe /l:"%Library%" /s:"%Stage%" /c:"Process Maps" >"%LogFile%"
e.g. to exclude documents:
KeyotiReindex.exe /l:"%Library%" /s:"%Stage%" /ec:"Documents" >"%LogFile%"

If there are only two Categories, ‘Process Maps’ and ‘Documents’, the two examples above have the same effect. Where there are other Categories, perhaps different types of documents, or where more may be added in future, documents are more reliably excluded by the former example, which specifies ‘Process Maps’ only. The alternative would be to exclude a comma-delimited list of document-related Categories, which would need updating whenever a Category is added, e.g.


KeyotiReindex.exe /l:"%Library%" /s:"%Stage%" /ec:"Forms,Scripts,Work Instructions" >"%LogFile%"
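The difference between including and excluding Categories can be illustrated with a small Python model. This is a sketch of the filtering behaviour as described in this article, not KeyotiReindex.exe's actual implementation; `select_categories` is a hypothetical function written for the example.

```python
def select_categories(all_categories, include=None, exclude=None):
    """Model the filter: keep only `include` if given, then drop `exclude`."""
    selected = list(all_categories)
    if include is not None:
        selected = [c for c in selected if c in include]
    if exclude is not None:
        selected = [c for c in selected if c not in exclude]
    return selected


# With only two Categories, including one equals excluding the other.
two = ["Process Maps", "Documents"]
same = select_categories(two, include=["Process Maps"]) == select_categories(
    two, exclude=["Documents"]
)
print(same)

# Once more Categories exist, excluding 'Documents' alone lets the others through.
more = two + ["Forms", "Work Instructions"]
included_only = select_categories(more, include=["Process Maps"])
excluded_docs = select_categories(more, exclude=["Documents"])
print(included_only)
print(excluded_docs)
```

Under this model, the include form stays correct as Categories are added, whereas an exclusion list has to name every document-related Category.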

Reindex Documents.cmd

If documents are to be excluded from the post-publish index, there needs to be another mechanism to index them. This could be a separate CMD file that’s run from a scheduled task. Commands could be:


KeyotiReindex.exe /c:"Documents" >"%LogFile%"
Or:
KeyotiReindex.exe /ec:"Process Maps" >"%LogFile%"
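To keep the post-publish and scheduled commands consistent with each other, the command lines could be generated from one place. The Python sketch below uses only the switches shown in this article (/l, /s, /c, /ec); `build_reindex_command` is a hypothetical helper, and the log redirection is left out so that the caller can add it.

```python
def build_reindex_command(library=None, stage=None, include=None, exclude=None):
    """Compose a KeyotiReindex.exe command line from the switches in this article."""
    parts = ["KeyotiReindex.exe"]
    if library:
        parts.append(f'/l:"{library}"')
    if stage:
        parts.append(f'/s:"{stage}"')
    if include:
        # /c takes a comma-delimited list of Categories to index.
        parts.append('/c:"{}"'.format(",".join(include)))
    if exclude:
        # /ec takes a comma-delimited list of Categories to skip.
        parts.append('/ec:"{}"'.format(",".join(exclude)))
    return " ".join(parts)


# Post-publish: process maps only, for the library site being published.
post_publish = build_reindex_command("%Library%", "%Stage%", include=["Process Maps"])
# Scheduled: everything except process maps, across all library sites.
nightly = build_reindex_command(exclude=["Process Maps"])

print(post_publish)
print(nightly)
```

The two generated commands split the work so that each Category is indexed by exactly one of them.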

Reindex All.cmd

This script could be used to index all Data Sources. It would only be used when logged on directly to the server. It has the same effect as running ‘KeyotiReindex.exe’, but also records its progress in a log file.


KeyotiReindex.exe >"%LogFile%"

A customer example

While working on a customer’s system recently, it was noted that with a standard post-publication re-index, indexing maps took about 4 minutes, whereas indexing documents took about 23 minutes. However few changes had been made to the maps, every publication took at least as long as that full re-index. Reconfiguring the post-publish re-index so that it covered only process maps gave a significant saving in publication time. Re-indexing documents became a scheduled task, timed for when the system was unlikely to be otherwise busy.
