With regard to category training:
1) Why does the count in brackets after a repository not always match the count of files ingested?
2) Why is the count of results returned sometimes inaccurate, and why are blank pages sometimes returned?
Subitems are never shown when you browse a repository.
The count shown beside the repository on the category screen includes subitems, but only if you selected Analyze subitems during ingestion.
If you ingest a repository and do not select Analyze subitems, then the count will not include subitems.
The reasoning behind this is that the search screen lets you return parent files or subitems depending on the options you select.
The files shown in the results can also have slightly different icons depending on what you have selected.
You can only search on subitems if you chose to index them during ingestion by selecting Analyze subitems when setting up the repository.
If you do not want subitems to be searchable, leave Analyze subitems unselected.
Even if you have selected Analyze subitems, the subitems will not appear in search results unless you also select Match on subitems in Category Options.
If you select Match on subitems you see the subitems shown in the search results.
If you don't select Match on subitems then subitems will not be shown in search results.
When subitems are shown, they have a slightly different icon to highlight that each one is a child of a parent file, i.e. a subitem.
In addition the name of the child will also show the parent filename.
So you can control what you see by using the options above.
For example for a repository ingested with Analyze subitems enabled, you could create a category with Match on subitems not selected.
Any search that matches will show the parent files but not the subitems.
Similarly, you could then create another category (or edit the existing category) to enable Match on subitems.
Any search that matches will show the child items but not the parent files.
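The combinations described above can be summarised in a small Python sketch. This is only an illustration of the behaviour as described in this article, not product code: the Repository class, the displayed_count and result_kind functions, and the example numbers are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Repository:
    """Hypothetical model of an ingested repository (illustration only)."""
    parent_files: int
    subitems: int
    analyze_subitems: bool  # the "Analyze subitems" choice made at ingestion


def displayed_count(repo: Repository) -> int:
    """Count shown beside the repository on the category screen."""
    if repo.analyze_subitems:
        # Subitems were indexed, so they are included in the count.
        return repo.parent_files + repo.subitems
    return repo.parent_files


def result_kind(repo: Repository, match_on_subitems: bool) -> str:
    """Which kind of item a matching category search shows."""
    if repo.analyze_subitems and match_on_subitems:
        return "subitems"
    # Either subitems were never indexed, or the category does not match on them.
    return "parent files"


# Made-up numbers: 100 parent files containing 40 subitems in total.
with_subitems = Repository(parent_files=100, subitems=40, analyze_subitems=True)
without_subitems = Repository(parent_files=100, subitems=40, analyze_subitems=False)
```

With these made-up numbers, the repository ingested with Analyze subitems shows a count of 140 even though browsing it shows only the 100 parent files, which is the mismatch asked about in question 1.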
Category searches return approximate result counts because of the default query that is issued.
The query sets Predict=True, which instructs the IDOL server to use statistical sampling to estimate the total number of results available.
Queries against the metastore do not use IDOL, but they can still use Predict=True, with the same effect: an approximate count is returned.
This can account for a discrepancy in the count of search results shown, and an overestimated total would also explain result pages that come back blank. Prediction was implemented for performance reasons.
You can disable prediction by doing the following:
In the configuration file (.cfg) of each content engine, in the [Server] section, change TotalResultsPredictionThreshold from its default value of 1000 to 0.
Save the change and restart each content engine for the change to take effect.
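After the edit, the relevant part of each content engine configuration file would look something like this (a sketch only; any other settings already present in the [Server] section stay as they are):

```
[Server]
TotalResultsPredictionThreshold=0
```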