Contributor.

Issue with search, IRQUEUE gets processed but nothing is added to SCIREXPERT during an IR Regen

My environment is HP SM 9.41 Hybrid, horizontally scaled, with asynchronous IR set up.

Our text search stopped working due to a corrupted IR index for ir.probsummary, so I triggered an IR Regen on the probsummary table. The regen added about 70,000 records to the SCIREXPERT table with the filename ir.probsummary, but then it got "stuck". The sm.log file showed that the IRQUEUE session was getting a signal 11 error and terminating. I tried to manually start the sm -que:ir process multiple times, but each time it stopped with the same error message in the logs. Since then I did a full server restart. After that it seemed to process the entire IRQUEUE table, but it wasn't actually putting any entries in the SCIREXPERT table, and now that the IRQUEUE table has been completely emptied, any text search just says no incidents found. There are no error messages in the sm.log that I can see that would explain this, and I'm not sure what might cause this behaviour. Has anyone else experienced this or a similar issue and could help?

Micro Focus Expert

The signal 11 could be caused by any of the factors mentioned in https://softwaresupport.hpe.com/group/softwaresupport/search-result/-/facetsearch/document/KM02161121. I have found that IR Regen doesn't do well once you start having millions of records in the db.

One remedy is https://softwaresupport.hpe.com/group/softwaresupport/search-result/-/facetsearch/document/KM868376: do the regen in dev and port it back to prod. Note: it's not mentioned in KM868376 but it is in other KMs that you can increase the shared memory allocation specifically for the IR regen in dev to help it along, since you don't have to worry about production's need for shared memory. Also, to save some time, delete scirexpert in dev yourself before you start. I'm not sure why, but the IR regen's deletion of scirexpert takes longer than doing it manually.

Or stop using IR: disable it, free up the resources and just use the Knowledge Management module. IR is old tech and free, whereas the KM module is newer tech and is continuously developed, but it is not free.

Contributor.

The signal 11 error is no longer occurring since the restart. The IRQUEUE is being processed, but it's being processed incorrectly: entries are cleared from IRQUEUE without any being created in the SCIREXPERT table. Also, we don't have millions of records in probsummary, just around 210,000, so I wouldn't expect that to be too much for the IR Regen to handle.

I could do the regen in dev, but I'm concerned that the IRQUEUE is no longer being processed correctly at all, so any new records won't be searchable until I do another regen in dev each time, which would be a real hassle compared to getting it working correctly in production.

I have the shared memory set to 256000000; I increased it when I did the restart, since memory isn't really an issue on the server.

Unfortunately, disabling IR and using the KM module isn't really an option for us right now, since we don't have the license for that module.

Ideally I would like to troubleshoot this further and get the issue resolved. Actually, now that I look again, I see there is an error message straight after the IRQUEUE gets filled:

 

1720( 3908) 10/24/2017 11:10:46 RTE W sqllimit exceeded, user=IRQUEUE limit=5.000 actual=12.594 SQL statement follows
1720( 3908) 10/24/2017 11:10:46 RTE D 23080554: sqmssqlSelect - EXECUTE:SELECT * FROM IRQUEUEM1 READCOMMITTED WHERE "FILENAME"=? AND "KEYINTERNAL"=? AND "COUNTER"=?

I'm not sure why the sqllimit is 5 seconds; my understanding is that it should default to 30 seconds.
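
For anyone who wants to check how often this warning shows up, here is a minimal Python sketch that scans sm.log lines for "sqllimit exceeded" entries and pulls out the user, the limit and the actual time. The regular expression is an assumption based only on the two log lines quoted above, and the names (SqlLimitWarning, find_sqllimit_warnings) are made up for illustration; they are not part of Service Manager.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class SqlLimitWarning(NamedTuple):
    timestamp: str
    user: str
    limit: float   # configured limit in seconds
    actual: float  # time the statement actually took, in seconds


# Assumed format, taken from the warning line quoted in this thread:
# 1720( 3908) 10/24/2017 11:10:46 RTE W sqllimit exceeded, user=IRQUEUE limit=5.000 actual=12.594 ...
PATTERN = re.compile(
    r"(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+RTE W sqllimit exceeded, "
    r"user=(?P<user>\S+) limit=(?P<limit>[\d.]+) actual=(?P<actual>[\d.]+)"
)


def find_sqllimit_warnings(lines: Iterable[str]) -> Iterator[SqlLimitWarning]:
    """Yield one record per 'sqllimit exceeded' warning found in the given lines."""
    for line in lines:
        match = PATTERN.search(line)
        if match:
            yield SqlLimitWarning(
                match["ts"],
                match["user"],
                float(match["limit"]),
                float(match["actual"]),
            )
```

It could be fed an open log file directly, e.g. `for w in find_sqllimit_warnings(open("sm.log")): print(w)`, to see whether the IRQUEUE user is the only one hitting the limit.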

 

Contributor.

I've triggered a new IR Regen of the probsummary table with the "sm -que:ir" process stopped. Once the IRQUEUE table was populated, I manually started the process with the options "sm -que:ir -sqllimit:60 -ir_trace". I'm not getting the sqllimit message in the logs this time, but it's still doing the same thing: the number of rows in the IRQUEUE table is decreasing without any new entries being added to the SCIREXPERT table.
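
To make that symptom easier to spot, here is a rough Python sketch that compares row-count samples taken over time. It assumes you collect the counts yourself, for example with a SELECT COUNT(*) against IRQUEUEM1 and SCIREXPERTM1 every few minutes; the function name and the status strings are hypothetical and only illustrate the check.

```python
from typing import List, Tuple

# One sample = (rows in IRQUEUEM1, rows in SCIREXPERTM1) at a point in time.
Sample = Tuple[int, int]


def classify_progress(samples: List[Sample]) -> str:
    """Compare the first and last sample and describe what the IR queue is doing."""
    if len(samples) < 2:
        return "not enough samples"
    (queue_start, index_start), (queue_end, index_end) = samples[0], samples[-1]
    drained = queue_start - queue_end   # queue rows consumed
    indexed = index_end - index_start   # index rows created
    if drained <= 0:
        return "queue not draining"
    if indexed <= 0:
        # The behaviour described in this thread: rows leave the queue,
        # but nothing new arrives in the index table.
        return "draining without indexing"
    return "indexing"
```

With counts like (200000, 70000) followed by (150000, 70000), this returns "draining without indexing", which is the situation described here.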

Micro Focus Expert

I haven't seen such an error before.

Do you get the same issue with a different table?

If it's the same, maybe unload a single scirexpert record and see if you can load it back into scirexpert, to rule out a DB issue.

Contributor.

OK, so I've manually run sm -que:ir from the sync server instead of the main primary server in the horizontal scaling environment, and it is processing the records correctly. It's been going for about 14 hours, is halfway through the 200,000 IRQUEUE records and has created 440,000 records in SCIREXPERTM1.

It's quite strange though. I have no clue why the behaviour would be so different on one server versus the other: they have identical configuration and software versions, including Java.

Micro Focus Expert

Thanks for sharing the solution. This will be going into my notes.

I didn't realise it has to be run from the sync server. I would have run it from the main primary server too, but I haven't done this for a long time.

I did some digging. The sync process handles locks and an IR regen needs exclusive locks, so I suspect that may be the reason.

IR Horizontal Scaling
Since IR Expert indexes are held in shared memory, locks to these indexes have to be communicated between the different machines in horizontal scaling to prevent IR issues. There are two different kinds of access against the IR indexes in shared memory: add/update when a record was added or updated or an IR regen was performed, and IR searches. Exclusive locks are required for any add / update action, meaning no searches can be performed on an IR file while the IR Index is being updated.

Contributor.

That's the thing: I don't think it is meant to be run from the sync server; at least it wasn't that way in any recommendations I looked at while we set up the environment. I just attempted it because I wasn't sure where to go next with my troubleshooting of such a frustrating issue. It seems to be working now, so I'm happy about that as it gives me a viable workaround, but I'll still need to find out why it wasn't working from the primary server like it should.

Micro Focus Expert

Have you experienced any other issues lately, e.g. to do with load balancing?

I'm not sure if IR Regen has anything to do with JGROUP, but the server-to-server comms are handled via JGROUP. You could give https://softwaresupport.hpe.com/group/softwaresupport/search-result/-/facetsearch/document/KM1304227 a go to see if the JGROUP comms are affecting your IR regen.

Contributor.

I haven't experienced any issues. There are a number of active sessions connected to both servers at the moment, and people are logging and updating incidents, changes and problems without issue, as per normal operations. What sort of issues would you expect to notice if there were JGROUP problems?

Micro Focus Expert

Users running out of connections, because the load balancer isn't talking to the other servers, isn't getting a real session count and isn't spreading the load. A servlet may then end up overloaded while other servlets are free. Come to think of it, you can just run sm -reportlbstatus to check on your load balancer and confirm whether you have a JGROUP comms issue.

Contributor.

OK, below is the output from running sm -reportlbstatus.

 

Load Balancer Status: Wed Oct 25 10:51:27 ACST 2017
Retrieving cluster status ...

HostName: DRWNT-WS18.ntschools.net
----- Server Instances -----
ProcessID  ClusterAddress    HttpPort  HttpsPort  Sessions  DbgMode  QMode  LB  State  LowMem  JAVA_USED/MAX/PERCENT           HEAP_USED/MAX/PERCENT
1624       DRWNT-WS18-49596  13080     0          (15/50)   N        N      N   RUN    N       (37107184/178978816/20.732725)  (1239433216/4294836224/28.858685)
1208       DRWNT-WS18-36395  13085     13096      (7/50)    Y        N      N   RUN    N       (31536056/178978816/17.619993)  (897257472/4294836224/20.89154)
----- Non Server Instances -----
ProcessID  ClusterAddress    State  LowMem  JAVA_USED/MAX/PERCENT           HEAP_USED/MAX/PERCENT              Command Line parameters
3736       DRWNT-WS18-20034  RUN    N       (3818216/178978816/2.133334)    (690311168/4294836224/16.07305)    -que:ir -sqllimit:60 -log:irqueue.log -ir_trace
2084       DRWNT-WS18-23813  RUN    N       (3733768/178978816/2.086151)    (677011456/4294836224/15.763382)   -sync

HostName: DRWNT-AP18.ntschools.net
----- Server Instances -----
ProcessID  ClusterAddress    HttpPort  HttpsPort  Sessions  DbgMode  QMode  LB  State  LowMem  JAVA_USED/MAX/PERCENT           HEAP_USED/MAX/PERCENT
1988       DRWNT-AP18-44473  13085     13096      (6/50)    Y        N      N   RUN    N       (36471136/178978816/20.377348)  (964808704/4294836224/22.464388)
1616       DRWNT-AP18-27897  13080     0          (12/50)   N        N      N   RUN    N       (36156856/178978816/20.201752)  (1221799936/4294836224/28.448114)
----- Non Server Instances -----
ProcessID  ClusterAddress    State  LowMem  JAVA_USED/MAX/PERCENT           HEAP_USED/MAX/PERCENT              Command Line parameters
2152       DRWNT-AP18-2353   RUN    N       (15998016/178978816/8.938497)   (1012047872/4294836224/23.564295)  system.start
5156       DRWNT-AP18-20625  RUN    N       (0/67108864/0.0)                (523440128/4294836224/12.187662)   -reportlbstatus
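
In case it helps anyone reading this output, here is a small Python sketch that pulls the command line out of a "Non Server Instances" row, so you can see at a glance which host is running -que:ir or -sync. It assumes each row has first been rejoined onto a single line (the console wraps at 80 columns), and the row layout is inferred only from the output above; the names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class NonServerInstance(NamedTuple):
    pid: int
    cluster_address: str
    state: str
    low_mem: str
    command_line: str


# Assumed layout: ProcessID ClusterAddress State LowMem (java tuple)(heap tuple) command line
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<addr>\S+)\s+(?P<state>\S+)\s+(?P<lowmem>[YN])\s+"
    r"\([^)]*\)\s*\([^)]*\)\s*(?P<cmd>.*)$"
)


def parse_non_server_row(line: str) -> Optional[NonServerInstance]:
    """Parse one rejoined Non Server Instances row; return None for any other line."""
    match = ROW.match(line)
    if match is None:
        return None
    return NonServerInstance(
        int(match["pid"]),
        match["addr"],
        match["state"],
        match["lowmem"],
        match["cmd"].strip(),
    )


def runs_ir_queue(instance: NonServerInstance) -> bool:
    """True if the process was started with the -que:ir parameter."""
    return "-que:ir" in instance.command_line.split()
```

Applied to the rows above, only process 3736 on DRWNT-WS18 is started with -que:ir.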
