Ensign
1361 views

UCMDB processing queue: very large number of queued results

Hi,

We have a problem with a large processing queue that is not being processed, and job statuses are not being updated. See part of the UCMDB server log below.

Any suggestions on where I should start looking? Also, is this "processing queue" the same as the probe's "sending queue"?

2018-04-21 10:18:35,311  INFO   [Process Results Thread-HostReDiscover-PowerShellTest] - Processing result of 'HostReDiscover-PowerShellTest' from probe: 'IMMI.PROD.05(PDCWPIVAS34)' took 2880msec. Waiting time (in result processing queue): 21077839 .Statistics: [Total objects: 160, Total links: 347] Objects - [Input: process(26) ip_service_endpoint(120) nt(1) websphereas(1) running_software(2) ip_address(5) interface(2) unix(3) client_server(80) containment(5) composition(151) dependency(16) usage(95) ] [Added: process(6) ip_service_endpoint(68) client_server(79) composition(74) dependency(6) usage(94) ] [Updated: process(12) ip_service_endpoint(5) running_software(2) unix(1) ] [Removed: 0]     Links   - [Input: 0] [Added: 0] [Updated: 0] [Removed: 0]
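
If I read that line correctly, the processing itself took only 2880 ms, but the result waited 21077839 ms in the processing queue, which is almost six hours. A quick conversion (values copied from the log above):

# values copied from the server log line above
processing_ms = 2880
waiting_ms = 21077839
print(f"processed in {processing_ms / 1000:.1f} s, "
      f"waited {waiting_ms / 3600000:.1f} h in the queue")   # ~5.9 hours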

0 Likes
13 Replies
Admiral

We fixed a lot of inefficient SQL commits in 10.33 CUP1. What is your version?

Kind regards,
Bogdan Mureșan

EMEA Technical Success
0 Likes
Ensign

It is 10.22 CUP3.

0 Likes
Ensign

I also want to understand how the queues work and how UCMDB gets the results. Is there a message queue somewhere on the probe and the server? Where can I see the queues?

0 Likes
Admiral

There were some SQL commits that weren't very efficient, and we managed to fix them (hopefully all of them) only in 10.33 CUP1. Some of the code changes were applied to the 10.22 CUPs as well.

From a performance point of view, you're not on a fortunate version. I remember that at one point I counted over 100 customer-visible fixed defects between CUP2 and CUP6; it's all in the release notes.

The probes will retry sending the CI bulks until the server acknowledges them. There is also a TTL for those bulks, so they will eventually age out on the probe side.
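
As a rough mental model of that retry behavior (a minimal sketch only, not the actual probe code; the TTL value and all names here are illustrative):

import time
from dataclasses import dataclass, field

BULK_TTL_SECONDS = 3600          # illustrative TTL; the real probe value is configurable
RETRY_INTERVAL_SECONDS = 15      # roughly the cadence visible in probe logs

@dataclass
class Bulk:
    cis: list
    created_at: float = field(default_factory=time.time)

def drain_sending_queue(queue, send_to_server):
    """Resend each CI bulk until the server acks it; drop bulks whose TTL expired."""
    while queue:
        bulk = queue[0]
        if time.time() - bulk.created_at > BULK_TTL_SECONDS:
            queue.pop(0)                        # bulk aged out on the probe side
        elif send_to_server(bulk):              # True means the server acknowledged it
            queue.pop(0)
        else:
            time.sleep(RETRY_INTERVAL_SECONDS)  # no ack yet; retry the same bulk later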

It also matters what your discovery schedule looks like. Hopefully you don't run everything at the same time and on a daily basis.

Kind regards,
Bogdan Mureșan

EMEA Technical Success
0 Likes
Ensign

Thanks for the insight.

I can confirm that the probe keeps sending results to the server and that they are acknowledged; see the sample log below, and the rough throughput estimate after it. However, according to the probe status, there are still many results not yet sent. As for the 5148 (ms, I assume), I don't know whether that is slow or not.

And I don't know whether the server has somehow stopped processing them or is just processing them slowly. Is there a way to check?

<2018-04-27 10:34:08,460> 41375610 [INFO ] (TaskResultsSenderThread.java:162) - Finished sending results to server. time: 5148
<2018-04-27 10:34:08,460> 41375610 [INFO ] (TaskResultsSenderThread.java:167) - Process Result Time Statistics - Total Time:5148, Results size:200, Time To get Tasks:187,
<2018-04-27 10:34:23,810> 41390960 [INFO ] (TaskResultsSenderThread.java:162) - Finished sending results to server. time: 5163
<2018-04-27 10:34:23,810> 41390960 [INFO ] (TaskResultsSenderThread.java:167) - Process Result Time Statistics - Total Time:5163, Results size:200, Time To get Tasks:172,
<2018-04-27 10:34:39,130> 41406280 [INFO ] (TaskResultsSenderThread.java:162) - Finished sending results to server. time: 5148
<2018-04-27 10:34:39,130> 41406280 [INFO ] (TaskResultsSenderThread.java:167) - Process Result Time Statistics - Total Time:5148, Results size:200, Time To get Tasks:156,
<2018-04-27 10:34:54,449> 41421599 [INFO ] (TaskResultsSenderThread.java:162) - Finished sending results to server. time: 5164
<2018-04-27 10:34:54,449> 41421599 [INFO ] (TaskResultsSenderThread.java:167) - Process Result Time Statistics - Total Time:5164, Results size:200, Time To get Tasks:140,
<2018-04-27 10:35:09,815> 41436965 [INFO ] (TaskResultsSenderThread.java:162) - Finished sending results to server. time: 5210
<2018-04-27 10:35:09,815> 41436965 [INFO ] (TaskResultsSenderThread.java:167) - Process Result Time Statistics - Total Time:5210, Results size:200, Time To get Tasks:141,
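
A quick calculation from that log: each send ships 200 results and the sends are about 15 seconds apart, so the probe is pushing roughly 13 results per second (a minimal sketch using the timestamps above):

from datetime import datetime

# timestamps and result sizes copied from the probe log above
sends = [
    ("2018-04-27 10:34:08,460", 200),
    ("2018-04-27 10:34:23,810", 200),
    ("2018-04-27 10:34:39,130", 200),
    ("2018-04-27 10:34:54,449", 200),
    ("2018-04-27 10:35:09,815", 200),
]
fmt = "%Y-%m-%d %H:%M:%S,%f"
times = [datetime.strptime(ts, fmt) for ts, _ in sends]
elapsed = (times[-1] - times[0]).total_seconds()   # ~61 s window
results = sum(size for _, size in sends[1:])       # results sent after the first stamp
print(f"~{results / elapsed:.1f} results/s, ~{results / elapsed * 3600:.0f} per hour")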

0 Likes
Admiral

I see 2 possible scenarios for you:

  1. Hold on a little bit and wait for 10.22 CUP7; you will squeeze out a little more stability and performance. It's a one-step update.
  2. Be patient for the same amount of time and wait for the 10.33 CUP2 release. This implies a two-step upgrade (to 10.33, then deploying CUP2) and will allow you to upgrade the PostgreSQL DB to 9.4.17. I was amazed yesterday by the improvement on this side. Your licenses should still be valid for 10.33 if I remember correctly, but you can confirm this via Support. On 10.3x you will also benefit from the multithreaded identification process, which even in worst-case scenarios is twice as fast.
Kind regards,
Bogdan Mureșan

EMEA Technical Success
0 Likes
Cadet 3rd Class

Thanks for the hint.

I was having the same issue; after applying 10.33 CUP4 (UCMDB_00228), the data-in performance improved significantly.

0 Likes
Commander

Hello,
Have you considered increasing the number of global threads available to the probe and, of course, redistributing the additional threads among the active jobs?
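
For reference, the global pool is set on the probe side and the per-job limits in each job's execution options. A sketch of the probe setting (the property name below is from memory for 10.x probes and may differ in your version; please verify against your own DataFlowProbe.properties before changing anything):

# DataFlowProbe.properties on the probe machine
# (assumed property name; check your file first)
appilog.agent.local.services.poolThreads = 120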

Kind regards
0 Likes
Fleet Admiral

@florian_miceli the issue here is not that the probe is discovering slowly, but that UCMDB cannot cope with the queue of sent results fast enough. The performance bottleneck is the UCMDB server.

Likes are appreciated!

Hello everyone,

I currently have the same situation as described above.

Our script tries to send around 5 million CIs, but the sending stops after 1.5 million CIs. That means 3.5 million CIs remain in the queue, and the queue size has not changed for days. Even a restart of the probe and the server did not force the probe to send any chunk.

Is there any way to force the probe to resend an unsent chunk? How can we debug the result processing / sending?

The only error message / warning I can see in the error.log file is the following:
Contained a(n) error on [type=ip_address, External Id=unknown, DisplayLabel=192.168.72.2, Temp Id=$TempObject123456]. His anchor id {null} is missing in OSH Vector, This may cause incorrect results in repopulate flow.

Even if the data contains an error, I would expect an error message within a certain amount of time. That the data is not processed at all, however, is a very bad situation!

Best regards,

Michael

0 Likes
Micro Focus Expert

Hello Michael,

I presume that these CIs are from an integration, right?

Can you provide more details? What CMS version are you using?

Kind regards,
Bogdan Mureșan
EMEA CMS Technical Success
0 Likes