TMS (Super Contributor)

Finding the root cause of message buffering

Dear Expert,

How can we pinpoint the exact reason messages are being buffered? Are there any logs or events that can be used to find it?

Regards,

TMS

KAKA_2 (Acclaimed Contributor)

Re: Finding the root cause of message buffering

It is always due to a communication problem between the managed node and the management server. If this is still happening, you can check the following:

- name lookups (forward and reverse)
- telnet to the OMW server on port 383
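
For the lookup check, a minimal sketch along these lines compares forward and reverse resolution from the managed node. It assumes a Linux or UNIX node where getent is available; the function name check_lookups and the host name in the usage note are made up for illustration.

```shell
# check_lookups HOST: compare forward and reverse name resolution for HOST.
# Prints one summary line; returns 0 if both directions agree, 1 otherwise.
# Assumption: getent is available on the node (swap in nslookup or host if not).
check_lookups() {
    host=$1
    # Forward lookup: first address the resolver returns for the name.
    ip=$(getent hosts "$host" | awk 'NR==1 {print $1}')
    if [ -z "$ip" ]; then
        echo "forward lookup failed: $host"
        return 1
    fi
    # Reverse lookup: first name the resolver returns for that address.
    name=$(getent hosts "$ip" | awk 'NR==1 {print $2}')
    if [ -z "$name" ]; then
        echo "reverse lookup failed: $ip"
        return 1
    fi
    if [ "$name" = "$host" ]; then
        echo "ok: $host <-> $ip"
    else
        echo "mismatch: $host -> $ip -> $name"
        return 1
    fi
}
```

For example, check_lookups mgmtsv.example.com prints a single line saying whether the name and the address map to each other. A mismatch can also just mean that a short name was passed in and the resolver answered with the fully qualified name, so treat it as a hint and not as proof.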

-KAKA-
TMS (Super Contributor)

Re: Finding the root cause of message buffering

Thank you for the information.

However, because we need to prevent this from happening again, we need to know the root cause and resolve the issue.

What could this connectivity issue be? Is it one of the following?

- Port not opened
- Server down
- Server not reachable
- Firewall blocked
- Service is down

Especially when both servers are in the same VLAN, a connectivity problem seems very unlikely. Hence we need to find a convincing answer with this analysis.

KAKA_2 (Acclaimed Contributor)

Re: Finding the root cause of message buffering

In that case you can try your luck with support, but I doubt it will help, as support might ask you to reproduce the issue.

-KAKA-
Micro Focus Expert

Re: Finding the root cause of message buffering

Hello,

 

It depends on whether the buffering is intermittent or persistent.

1. Intermittent buffering

The buffering is often intermittent: the agent buffers, and a minute later it is fine again.
The agent checks once per minute whether it can reach the management server again, which is why the buffering often disappears after a minute.

This can have many causes, many of them intermittent as well, so there is often not much you can do about it. For example:

- This could be caused by temporary network issues (e.g. DNS issues).
- The management server is too busy.
- The management server is out of file handles. This may especially happen if Reverse Channel Proxies are used.
  You could increase the limit of file descriptors, e.g. on Linux: increase the maximum number of allowed file descriptors by using limits.conf, as follows:

  # tail /etc/security/limits.conf
  * soft nofile 4096
  * hard nofile 4096

  This sets the maximum available descriptors for all users to 4096.
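
Since limits.conf is only read when a new session starts, a process that is already running keeps the limits it was started with. To see what a server process actually has, something like the following rough sketch may help. It assumes Linux with /proc mounted and is best run as root; the function name fd_usage and the PROC_ROOT variable are made up for this example (PROC_ROOT exists only so the function can be tried against a test directory).

```shell
# fd_usage PID: print open vs. allowed file descriptors for a running process.
# Reads the Linux /proc tree. PROC_ROOT is a hypothetical override for testing.
fd_usage() {
    pid=$1
    proc_dir="${PROC_ROOT:-/proc}/$pid"
    if [ ! -r "$proc_dir/limits" ]; then
        echo "cannot read $proc_dir/limits" >&2
        return 1
    fi
    # "Max open files" row: field 4 is the soft limit, field 5 the hard limit.
    soft=$(awk '/^Max open files/ {print $4}' "$proc_dir/limits")
    hard=$(awk '/^Max open files/ {print $5}' "$proc_dir/limits")
    # Each entry in the fd directory is one open descriptor.
    open=$(ls "$proc_dir/fd" | wc -l | tr -d ' ')
    echo "pid=$pid open=$open soft=$soft hard=$hard"
}
```

Calling fd_usage with the PID of a management server process (for example ovbbccb) at the time the buffering occurs shows whether the open count is close to the soft limit.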

 

- other intermittent issues ...

 

 

There is no real solution for this, but for OMU/OML there is a workaround: the heartbeat polling (HBP) double check, which makes the heartbeat polling ignore these intermittent buffering issues and not generate a message:


          Here is an example for how to enable the HBP double
          check including agent buffering errors with the default
          of one retry:
          # ovconfchg -ovrg server -ns opc -set \
            OPC_HBP_DOUBLE_CHECK TRUE
          # ovconfchg -ovrg server -ns opc -set \
            OPC_HBP_DOUBLE_CHECK_DELAY_BUFFER 90

 

See the latest Consolidated Server Patch text or Config Variables Guide for more information about those config settings.

 

 

2. Persistent buffering.

 

There is some communication problem from the agent to the management server.

Check with bbcutil -ping from agent to server:

# bbcutil -ping <mgmtsv>

# bbcutil -ping https://<mgmtsv>:383/com.hp.ov.opc.msgr/
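
To catch failures that come and go, one option is to record every bbcutil -ping result with a timestamp and compare the log with the times the buffering messages arrived. A minimal sketch, assuming a POSIX shell on the agent; the function name log_ping and the default log path are made up, and the raw output is stored as-is without assuming anything about its format.

```shell
# log_ping TARGET [LOGFILE]: run one bbcutil -ping and append a timestamped line.
# Returns the exit status of bbcutil. The default log path is hypothetical.
log_ping() {
    target=$1
    logfile=${2:-/tmp/bbc_ping.log}
    # Capture stdout and stderr together, and remember the exit status.
    result=$(bbcutil -ping "$target" 2>&1) && rc=0 || rc=$?
    # Collapse multi-line output so each ping stays on one log line.
    result=$(printf '%s' "$result" | tr '\n' ' ')
    printf '%s target=%s rc=%s %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$target" "$rc" "$result" >> "$logfile"
    return "$rc"
}

# Example loop, commented out so that sourcing this file has no side effects.
# The 60 second interval mirrors the agent's once-per-minute check:
# while true; do log_ping "https://<mgmtsv>:383/com.hp.ov.opc.msgr/"; sleep 60; done
```

Gaps or error lines in the log that line up with the buffering messages point to the network or the server side; a clean log during a buffering period points more towards the agent itself.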

 

Some possible causes:

- If all agents are buffering, then there is probably an issue on the server (ovbbccb or opcmsgrb aborted or hung).
  The best approach is to restart the server processes or even reboot the management server system.

 

- Certificate issues

 

- Firewall blocks communication

 

- Name resolution (agent can't resolve the management server IP address)

 

- With older agent versions, I've seen cases where a message or data info (license info) was stuck in the message agent buffer for a non-existing target.
  Check the config settings for wrong management servers (e.g. a server that no longer exists set as general_licmgr).
  Check the mgrconf policy for wrong or no longer existing management servers.
  Remove the agent queues and message agent buffer to get rid of the old messages with an unreachable target:

# ovc -kill

# rm -f /var/opt/OV/tmp/OpC/*

# ovc -start

 

- ...

 

Best regards,

Tobias

 

KAKA_2 (Acclaimed Contributor)

Re: Finding the root cause of message buffering

Hello Tobias,

You mentioned the "Config Variables Guide".
Where can one find it?

-KAKA-
Goran Koruga (Absent Member)

Re: Finding the root cause of message buffering

You can find it here together with the other manuals:

 

http://support.openview.hp.com/selfsolve/manuals

 

The title is "Server Configuration Variables".

 

Regards,

    Goran

KAKA_2 (Acclaimed Contributor)

Re: Finding the root cause of message buffering

I find it under Operations Manager for UNIX. Is it not available for OMW?

-KAKA-

Goran Koruga (Absent Member)

Re: Finding the root cause of message buffering

Hello.

 

Yes, Tobias mentioned this is for OMU/OML.  I have no idea if similar settings/documents exist for OMW, sorry.

 

Regards,

    Goran

MohanSekar (Absent Member)

Re: Finding the root cause of message buffering

Hello Tobias,

 

Communication between the servers is working in every respect, but I still face this issue.

To describe it further:

A few of the managed nodes had the same OvCoreID, so the agent was reinstalled and the core ID was changed manually. The agent was then restarted.

The certificate request was triggered automatically, and the certificate installation was successful on the management server.

But the agent still throws the message agent buffering error.

Kindly advise if there is any workaround for this.

 

Thanks

 

Mohan

Micro Focus Expert

Re: Finding the root cause of message buffering

Hello Mohan,

 

Did you also correct the OvCoreID in the management server DB? (On OML you would do that with opcnode -chg_id; I think in OMW you can change it in the Node editor.)

Is name resolution OK?
Can the managed node resolve its own node name?
Can it resolve the OM management server node name?

 

 

In case there is still data from before the OvCoreID change, I would clear the temp files:

# ovc -kill

# rm -f /var/opt/OV/tmp/*

# rm -f /var/opt/OV/tmp/OpC/*

# ovc -start

 

 

Does bbcutil -ping work? Check from agent to server:

# bbcutil -ping <mgmtsv>

# bbcutil -ping https://<mgmtsv>:383/com.hp.ov.opc.msgr/

 

Best regards,

Tobias

 
