Document Categories in Search

Displaying search results

Quick Search results are commonly displayed on two separate tabs: one for process maps, the other for documents.

However, you may want to separate your documents into different categories, and have Quick Search present results separately for those categories.

Documents folder structure

You may have documents organised in this way:

\Documents\
   Forms\
   Scripts\
   Work Instructions\
   

or perhaps this:

\Documents\
   Forms\
   Scripts\
   Work Instructions\
   Miscellaneous\

The pertinent difference is where uncategorised documents live. In the former example they sit in the common root folder, alongside the sub-folders for the specific categories. In the latter they have a sub-folder of their own. The significance of this is explained as the different configurations are described.

Configuring Triaster Server's search

Scenario 1

This scenario uses the former document folder structure:

\Documents\
   Forms\
   Scripts\
   Work Instructions\
   

There could be four different categories, each of which would be defined as a DataSource in the Keyoti search engine's configuration file.

Triaster\TriasterServer2011\KeyotiSearch\IndexDirectory\indexableSourceRecord.xml


<DataSource ID="43" type="FileSystemDocumentStore" location="C:\Triaster\Documents\" query="http://TriasterTest1/Documents/" uniqueColumn="" resultURL="" ExtensionData="@forms|scripts|work instructions@True@">
  <t:Categories>
    <t:Category>Documents</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
<DataSource ID="44" type="FileSystemDocumentStore" location="C:\Triaster\Documents\Forms\" query="http://TriasterTest1/Documents/Forms/" uniqueColumn="" resultURL="" ExtensionData="@@True@">
  <t:Categories>
    <t:Category>Forms</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
etc.
 

In this example, the general 'Documents' and specific 'Forms' categories are depicted. Others for 'Scripts' and 'Work Instructions' would be analogous to that of 'Forms'.
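
As a rough sketch of what those analogous entries might look like, the following copies the 'Forms' pattern for the other two categories. The IDs 45 and 46 are assumptions (any unused numbers would do, with LastUsedID updated to match), as are the folder paths and URLs, which simply follow the 'Forms' example. Whether the space in 'Work Instructions' needs encoding in the query URL isn't covered here, so check that against your own installation.

```xml
<!-- Hypothetical entries modelled on the 'Forms' DataSource; IDs, paths and URLs are assumed. -->
<DataSource ID="45" type="FileSystemDocumentStore" location="C:\Triaster\Documents\Scripts\" query="http://TriasterTest1/Documents/Scripts/" uniqueColumn="" resultURL="" ExtensionData="@@True@">
  <t:Categories>
    <t:Category>Scripts</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
<DataSource ID="46" type="FileSystemDocumentStore" location="C:\Triaster\Documents\Work Instructions\" query="http://TriasterTest1/Documents/Work Instructions/" uniqueColumn="" resultURL="" ExtensionData="@@True@">
  <t:Categories>
    <t:Category>Work Instructions</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
```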

Each document root location is represented by a DataSource. Each DataSource includes:

  • ID - a unique number. At the top of the file, there's a LastUsedID setting that should equate to the highest used ID.
  • Location - a file path that represents the root location to index. The terminating '\' is required.
  • Query - a URL to that indexable location. The terminating '/' is required.
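  • ExtensionData - a string of fields delimited by '@'. In the examples above, the first field lists the sub-folders to exclude from indexing, separated by '|' (it is empty when nothing is excluded), and the second field is 'True'.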
  • Category - the tab(s) on which search results are displayed.
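
To tie these settings back to the configuration file, here is the general 'Documents' DataSource from Scenario 1 again, with each setting annotated. The reading of ExtensionData is inferred from the examples in this article rather than taken from Keyoti's documentation, so treat it as an assumption.

```xml
<!-- ID: a unique number; LastUsedID at the top of the file should equal the highest ID in use. -->
<!-- location: the root folder to index; note the terminating '\'. -->
<!-- query: the URL of that same folder; note the terminating '/'. -->
<!-- ExtensionData: the first '@'-delimited field appears to list the sub-folders
     to exclude, separated by '|' (an assumption drawn from this example). -->
<DataSource ID="43" type="FileSystemDocumentStore" location="C:\Triaster\Documents\" query="http://TriasterTest1/Documents/" uniqueColumn="" resultURL="" ExtensionData="@forms|scripts|work instructions@True@">
  <t:Categories>
    <!-- Category: the tab on which results from this location are displayed. -->
    <t:Category>Documents</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
```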

The order in which these DataSources are defined matters when they share locations. DataSources are indexed sequentially, and where two DataSources index the same location, the index information generated by the later one supersedes that of the earlier one.

This is why the above configuration for general documents has exclusions: it's futile to associate documents with one category when a subsequent indexing action is going to change that. Even without the exclusions, documents in the sub-folders of the more specific categories wouldn't be displayed on the general 'Documents' tab, because the initial linking of documents in the 'Forms' folder with the 'Documents' category is supplanted by the association with 'Forms'. How documents can be displayed on more than one tab is described later.

Perhaps the two main points to note in this scenario are:

  • The locations of the more specific types of document reside within the general documents location.
  • The 'Documents' configuration excludes the other locations when indexing this DataSource, as that index information would otherwise be overwritten when indexing the subsequent, more specific DataSources.

Scenario 2

A possibly neater implementation would be to have completely separate root folders for all categories of document.

\Documents\
   Forms\
   Scripts\
   Work Instructions\
   General\

The configuration in 'indexableSourceRecord.xml' would then be:


<DataSource ID="43" type="FileSystemDocumentStore" location="C:\Triaster\Documents\General\" query="http://TriasterTest1/Documents/General/" uniqueColumn="" resultURL="" ExtensionData="@@True@">
  <t:Categories>
    <t:Category>Documents</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
<DataSource ID="44" type="FileSystemDocumentStore" location="C:\Triaster\Documents\Forms\" query="http://TriasterTest1/Documents/Forms/" uniqueColumn="" resultURL="" ExtensionData="@@True@">
  <t:Categories>
    <t:Category>Forms</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
etc.
 

The difference here is that the general documents have their own unique root folder that doesn't contain the more specific categories of document. Exclusions are unnecessary, so there isn't the danger of excluding documents from searches inadvertently. However, if your document repository already has an established folder structure, this may not be practicable.

Displaying search results on more than one tab

So far, each set of indexing information (as defined by a DataSource) has been associated with one search tab (as defined by the DataSource's Category), so search results for a particular document are displayed on one search tab only. If you want the 'Documents' tab to show results for all documents, not just those that aren't associated with a specific type, each of the document-type DataSources needs to include 'Documents' as one of its categories.


<DataSource ID="44" type="FileSystemDocumentStore" location="C:\Triaster\Documents\Forms\" query="http://TriasterTest1/Documents/Forms/" uniqueColumn="" resultURL="" ExtensionData="@@True@">
  <t:Categories>
    <t:Category>Documents</t:Category>
    <t:Category>Forms</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
etc.
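
The other document-type DataSources would presumably be changed in the same way. A sketch for 'Scripts', assuming ID 45 and the same folder and URL pattern as 'Forms' (none of which is given in the configuration shown here):

```xml
<!-- Hypothetical 'Scripts' entry; the ID, path and URL are assumed from the 'Forms' pattern. -->
<DataSource ID="45" type="FileSystemDocumentStore" location="C:\Triaster\Documents\Scripts\" query="http://TriasterTest1/Documents/Scripts/" uniqueColumn="" resultURL="" ExtensionData="@@True@">
  <t:Categories>
    <!-- Listing both categories puts these results on the 'Documents' tab and the 'Scripts' tab. -->
    <t:Category>Documents</t:Category>
    <t:Category>Scripts</t:Category>
  </t:Categories>
  <t:Locations />
</DataSource>
```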
 

Summary

Documents can be categorised, and these categories searched separately or in combination with others. Defining such categories is highly flexible, and relatively simple when using existing ones as examples.
