[Online Expert Day] Smart Analytics (1): How to determine possible Stop Words?

We are just introducing HP SM 9.41 and Smart Analytics, and we have a question regarding usage:
With IR there was a way to determine possible stop words using a command in combination with sm.exe. What is the intended way to determine stop words for the IDOL Smart Analytics engine? For example, for hot topic analytics our results are currently different forms of greetings or welcome messages and no real hot topics.

We are aware of the Cleansing list and the stop word list, but we don't know the concrete difference between them.
Please advise how to determine the stop words and where they should be maintained.

Kind Regards
Alexander Lambertz

  • Hi Lambert.

    Hope you are doing well.


    A stop-word list is a list of terms that can be ignored when the search engine is searching or indexing. Typically, stop-word lists include short and common words or prepositions, such as "a," "the," or "with" in English. However, they may also include longer words, such as long number strings, or words that are too common to be useful as search targets, such as the term "internet." Stopwords are removed from words entered in the "Search for" box unless they are enclosed in double quotes (phrase search). They are not removed during indexing to allow for phrase searching.
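    As a rough illustration of the query-side behaviour described above (stop words dropped from the search string, double-quoted phrases left intact), here is a minimal Python sketch. It is not how Service Manager or IDOL implement it; the function name `strip_stop_words` and the three-word list are hypothetical and only stand in for the real language-specific lists.

```python
import re

# Hypothetical stop-word list for illustration only; the real lists are
# maintained per language in Service Manager and Smart Analytics.
STOP_WORDS = {"a", "the", "with"}


def strip_stop_words(query, stop_words=STOP_WORDS):
    """Drop stop words from a query, leaving double-quoted phrases untouched."""
    # Split the query into double-quoted phrases and bare words.
    tokens = re.findall(r'"[^"]*"|\S+', query)
    kept = []
    for token in tokens:
        # Quoted phrases are always kept (phrase search);
        # bare words are kept only if they are not stop words.
        if token.startswith('"') or token.lower() not in stop_words:
            kept.append(token)
    return " ".join(kept)


if __name__ == "__main__":
    print(strip_stop_words('the printer with "a paper jam"'))
```

    The quoted phrase keeps its leading "a" because the whole phrase is treated as one token, which is the reason stop words are not removed at indexing time.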

    Smart Search applies both the Smart Analytics stop words and the Service Manager stop words. However, some Service Manager stop words conflict with the stop-word logic used in Smart Search. For example, "before" is defined as a stop word in Service Manager, so "before" in a search string is ignored. Smart Search, however, supports search strings of the form "A BEFORE B", which return the results in which A comes before B. If "before" is not removed from the Service Manager stop words list, the returned results are not as expected. To avoid this problem, some of the English stop words used by the Solr search engine are removed when you enable Smart Analytics. Here is a list of the removed stop words.

    If you want to keep all the Service Manager stop words when you upgrade your Smart Analytics server, back up the stop words list before upgrade and import it after the

    When you search by the IDOL search engine:

    • If the search word is a stop word defined in the Solr search engine, no result is returned.
    • If the search word is a stop word defined in IDOL search engine only, a warning message is displayed.
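    The two cases above can be summarised as a small decision function. This is only a sketch of the described behaviour, not product code: the function name, the outcome labels, and the case-insensitive comparison are assumptions made for illustration.

```python
def search_outcome(word, solr_stop_words, idol_stop_words):
    """Return the expected outcome for a single-word search.

    Assumption: stop words are compared case-insensitively.
    """
    w = word.lower()
    if w in solr_stop_words:
        # Stop word defined in the Solr search engine: no result is returned.
        return "no_result"
    if w in idol_stop_words:
        # Stop word defined in the IDOL search engine only: a warning is shown.
        return "warning"
    # Not a stop word in either list: the search runs normally.
    return "search"


if __name__ == "__main__":
    solr = {"the", "a"}             # hypothetical sample lists
    idol = {"the", "internet"}
    for term in ("the", "internet", "printer"):
        print(term, "->", search_outcome(term, solr, idol))
```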

    Stop words are stored in Service Manager in lists by specific language. Not all languages support stop words (for example, Japanese and Chinese). Adjust the list of stop words by either adding or removing words from this list.

    The stop-word list for your log-in language is used by default, and is loaded once when you first log in. Changing the query language parameter on the advanced search screen changes the stop-word list used. The new stop-word list is loaded each time you search in a language other than your log-in language. This may cause a delay in your search being submitted as the stop-word list is loaded. If you need to perform extensive searches in a language other than your log-in language, HP recommends that you log out and then log back in with the other language to reduce this delay.

    To modify Service Manager stop words:
    1. Click Smart Analytics > Smart Search > More > Stop Words.
    2. Click Search.
    3. Select the record for the language code you wish to change.
    4. Add a new word or modify an existing word.
    5. Click Save.
    To modify Smart Analytics stop words:
    1. Open the <Smart Analytics Installation>/lanfiles folder.

      The stop words lists are saved as the <language name>.dat file in this folder.

    2. Open the language file that you wish to change.
    3. Add a new word or modify an existing word.
    4. Click Save.
  • Hello,

    thank you very much for your fast answer!
    How to maintain the stop words is described in this extended quote from the official documentation. But there is no information on how to determine possible stop words. With IR you can get a list by executing -verifyir (I guess). How can I get this list for IDOL? Determining each word one by one from feedback in different languages is not a real option.
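    For illustration only, the kind of candidate list meant here could be approximated outside IDOL by ranking terms by document frequency over exported ticket texts. The sketch below is a generic technique, not an IDOL or Service Manager feature; the function name `candidate_stop_words`, the thresholds, and the sample texts are all hypothetical.

```python
import re
from collections import Counter


def candidate_stop_words(documents, min_doc_ratio=0.5, top_n=50):
    """Rank terms by the share of documents in which they occur."""
    total = len(documents)
    if total == 0:
        return []
    doc_freq = Counter()
    for text in documents:
        # Count each term at most once per document (document frequency).
        doc_freq.update(set(re.findall(r"\w+", text.lower())))
    # Keep terms that appear in at least min_doc_ratio of all documents,
    # most frequent first.
    ranked = [
        (term, count / total)
        for term, count in doc_freq.most_common()
        if count / total >= min_doc_ratio
    ]
    return ranked[:top_n]


if __name__ == "__main__":
    tickets = [  # hypothetical sample texts
        "Hello team, the printer is broken",
        "Hello, my VPN fails",
        "Hello all, printer offline",
    ]
    for term, ratio in candidate_stop_words(tickets, min_doc_ratio=0.6):
        print(f"{term}\t{ratio:.2f}")
```

    Terms such as greetings that occur in most tickets float to the top; the resulting list would still need manual review per language before anything is added to a stop word list.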

    Maybe you also have some information regarding this.

    Kind Regards
    Alexander Lambertz