Commodore

Matching OpC HBP Internal Messages - Performing only one Automatic Action on the Management Server

Dear Experts,

 

I have noticed an issue when matching internal 'OpC' messages. OPC_INT_MSG_FLT=TRUE is set on both the management server and the managed node(s), and a msgi policy on the management server matches the Communication Broker and failed ping package messages.

For example, matching:

 

Message group: OpC
Application: HP Operations Manager
Object: ovoareqsdr (Request Sender)
Severity: Critical
Text: OV Communication Broker (ovbbccb) on node <Node FQDN> is down. (OpC40-1913)

 

And:

 

Message group: OpC
Application: HP Operations Manager
Object: ovoareqsdr (Request Sender)
Severity: Critical
Text: Node <Node FQDN> is probably down. Contacting it with ping packages failed. (OpC40-436)

 

Message Key used for correlation: <$MSG_OBJECT>:<$MSG_NODE>:<$MSG_GRP>
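
As an illustration of how a key pattern like this collapses repeated heartbeat messages into one message with a duplicate count, here is a minimal Python sketch. It is only a model and not OM code: `expand_key`, `count_duplicates`, the node name and the object value "ovoareqsdr" are assumptions made for the example.

```python
import re

# Key pattern as quoted in the post.
KEY_PATTERN = "<$MSG_OBJECT>:<$MSG_NODE>:<$MSG_GRP>"

def expand_key(pattern, attrs):
    """Replace each <$NAME> placeholder with the matching attribute value."""
    return re.sub(r"<\$([A-Z_]+)>", lambda m: attrs[m.group(1)], pattern)

def count_duplicates(messages, pattern=KEY_PATTERN):
    """Return {key: occurrences}; anything above 1 would count as a duplicate."""
    counts = {}
    for attrs in messages:
        key = expand_key(pattern, attrs)
        counts[key] = counts.get(key, 0) + 1
    return counts

# Hypothetical attribute values for one "node down" message.
down = {"MSG_OBJECT": "ovoareqsdr", "MSG_NODE": "node1.example.com", "MSG_GRP": "OpC"}

print(expand_key(KEY_PATTERN, down))
print(count_duplicates([down, down, down]))
```

In this model, the two example messages above would expand to the same key for a given node if both conditions use this pattern, because they share object, node and message group. Whether that matters depends on how the conditions are actually set up.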

 

With this setup, the automatic actions are re-executed on every duplicate, even though the correlation keys on the conditions have been set and the duplicates are counted against the existing message.

 

Checking the Annotations tab in the Java GUI suggests that the automatic action has been executed only once (there are no additional annotations), yet it has in fact been re-executed on each duplicate message received.

 

This has been confirmed by having the automatic action script write to a file and comparing the execution times against the arrival times of the duplicate messages.

 

By default, the variable for continuous heartbeat polling (HBP) errors, OPC_HBP_CONTINOUS_ERRORS, is set to FALSE.

In this environment, OPC_HBP_CONTINOUS_ERRORS has been set to TRUE.

As a result, duplicate messages are received at the heartbeat polling interval set per node (send alive packets has also been set to 'true'; HBP Type: 0x7).

 

Is there any way to have the automatic action for these internally matched messages executed only once?

 

Duplicate suppression may potentially work, although it is based only on an interval. Once that interval has elapsed, the automatic action would be executed again.
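
To make the concern about interval-based suppression concrete, here is a small Python sketch. It is a simplified model under stated assumptions, not the product's logic: the function name `actions_fired` is hypothetical, times are in seconds, and the interval is assumed to be measured from the last message that was let through.

```python
def actions_fired(arrival_times, suppress_interval):
    """Return the arrival times at which the action would run in this model."""
    fired = []
    last_passed = None
    for t in arrival_times:
        # A message passes if it is the first one, or if the suppression
        # interval has elapsed since the last message that passed.
        if last_passed is None or t - last_passed >= suppress_interval:
            fired.append(t)
            last_passed = t
    return fired

# Hypothetical heartbeat polls every 300 seconds while the node stays down.
polls = [0, 300, 600, 900, 1200, 1500]
print(actions_fired(polls, 900))  # [0, 900]
```

With a 900 second interval, the action runs at the first message and again once the interval has elapsed, which is the repeat described above.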

 

I would also like to know why OPC_HBP_CONTINOUS_ERRORS is set to 'false' by default.

 

With this setting off, many agents in the environment had issues that were not picked up until the OM services on the management server were restarted (or an offline backup was performed). The same was apparent with agents buffering messages.

Essentially, these issues were only picked up by the 'mass' poll that the management server performs on service startup.

 

This is a beneficial feature and should be widely documented (not only in the variable definition document, as it is now, but in the main manuals as well).

 

With this setting off, the only other ways to identify issues are to run a full node health ITOCHECKER report, opcragt <FQDN of node>, or bbcutil -ping <FQDN of node>.

 

This behaviour has been tested on the following HP Operations Manager for Linux/UNIX versions:

 

HP Operations Manager for Linux 9.10.210

HP Operations Manager for Linux 9.10.230

HP Operations Manager for Linux 9.10.240

 

Looking forward to the response.

 


Accepted Solutions
Micro Focus Expert

Hello,

 

Local automatic actions are by default always run immediately, but there is a little trick to avoid that.

In the policy condition, on the Advanced tab, enable output to the Server MSI, then check "divert messages". A new checkbox, "Immediate Local Automatic Actions", will appear; keep it unchecked.

Now the agent will not start the automatic action right away. Instead it sends the message to the management server, and the management server starts the action. If the message is suppressed as a duplicate, the action is not started.

 

Best regards,

Tobias
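
The difference between the two modes described in this answer can be sketched in a few lines of Python. This is a toy model of the behaviour as explained here, not OM internals; `run_count` and its parameters are hypothetical names.

```python
def run_count(duplicates, immediate_local_action):
    """Count action runs for one original message plus `duplicates` repeats."""
    runs = 0
    seen_key = False
    for _ in range(1 + duplicates):
        if immediate_local_action:
            # The agent starts the action before the server sees the
            # message, so duplicate detection on the server cannot stop it.
            runs += 1
        elif not seen_key:
            # The server starts the action, and only for a message that
            # was not suppressed as a duplicate.
            runs += 1
        seen_key = True
    return runs

print(run_count(4, immediate_local_action=True))   # 5
print(run_count(4, immediate_local_action=False))  # 1
```

In this model, an original message followed by four duplicates runs the action five times with immediate local actions, and once when the server starts the action.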


9 Replies
Micro Focus Expert

Hello,

 

There are several ways to enhance heartbeat polling (HBP) with OMU parameters.

Please check this KM document for details:

http://support.openview.hp.com/selfsolve/document/KM112191

 

For example, set OPC_HBP_DOUBLE_CHECK to TRUE:

ovconfchg -ovrg server -ns opc -set OPC_HBP_DOUBLE_CHECK TRUE

and set OPC_RECHECK_AGENT_ALIVE_OVCD_DOWN to TRUE:

ovconfchg -ovrg server -ns opc -set OPC_RECHECK_AGENT_ALIVE_OVCD_DOWN TRUE

 


To find out the interval on all your nodes, run the following on the OMU server:

opchbp -all

ovconfget

ovconfget -ovrg server

(attach results as files here)

 

 

Can you also attach the policy definition that performs your automatic actions on the matched messages?

 

Is a MoM in use? (I ask since you mentioned several OMU patches.)

 

 

 

Micro Focus Support
If you find that this or any post resolves your issue, please be sure to mark it as an accepted solution.
If you liked it I would appreciate KUDOs. Thanks
Commodore

Hi Vladislav,

 

Thanks for your input.

 

I am already aware of the knowledge document you have suggested as well as the variables you have suggested.

 

The problem I am experiencing has nothing to do with receiving false positives. These are 'genuine' down events. I am doing this in a test environment where I take a node down/offline, as well as the agent, etc.

 

We rely on these events to create tickets and SMS messages.

 

My question is how to stop the automatic action from being executed on every polling interval. Each duplicate alert received on a poll re-executes the automatic action (although the annotations do not reflect this), even when the correlation key is set.

 

Q. Can also attach the policy definition which performs your Automatic Actions on the received matched messages?

 

A. I will create a screenshot showcasing the condition and what is being performed

 

Q. Is there MoM used? (since you mentioned several OMU patches)

 

A. No Manager of Managers (MoM) is set up in this environment just a stand-alone management server with some controlled/managed node(s).

 

I will create some screenshots to demonstrate this issue.

Micro Focus Expert
Actually, there shouldn't be any AA executed on duplicate messages (if I understood you right).

Unless you have this variable defined on OMU:
OPC_EXEC_AA_WITH_DUPL_SUPPR=TRUE
(default value is FALSE)

OPC_EXEC_AA_WITH_DUPL_SUPPR
Description : Allows the server to execute a remote automatic action for
each duplicate message when count and suppress duplicates is
enabled.
Type/Unit : boolean
Default : FALSE


So we need to see what is set on your OMU server.




Commodore

Hi Vladislav,

 

Thanks for providing that variable.

 

I have checked the OPC namespace.

 

The only ones set are as follows:

 

[opc]
DATABASE=OVOUD
OPC_BBCDIST_RETRY_INTERVAL=600
OPC_HA=FALSE
OPC_HBP_CONTINOUS_ERRORS=TRUE
OPC_INSTALLATION_TIME=12/11/13 00:00:00 (changed to be ambiguous)
OPC_INSTALLED_VERSION=09.10.240
OPC_INT_MSG_FLT=TRUE
OPC_MGMTSV_CHARSET=utf8
OPC_MGMT_SERVER=omserver (changed to be ambiguous)
OPC_SUPPRESS_OUTAGE_BEFORE_MSI=TRUE
OPC_SVCM_ADD_WARN_IF_EXISTS=TRUE
OPC_SVCM_ERROR_CHECKING=FULL

 

I will try and set it anyway.

 

Performing the following:

ovconfchg -ovrg server -ns opc -set OPC_EXEC_AA_WITH_DUPL_SUPPR FALSE

 

After performing this it now shows that entry:

 

[opc]
DATABASE=OVOUD
OPC_BBCDIST_RETRY_INTERVAL=600
OPC_EXEC_AA_WITH_DUPL_SUPPR=FALSE
OPC_HA=FALSE
OPC_HBP_CONTINOUS_ERRORS=TRUE
OPC_INSTALLATION_TIME=12/11/13 00:00:00 (changed to be ambiguous)
OPC_INSTALLED_VERSION=09.10.240
OPC_INT_MSG_FLT=TRUE
OPC_MGMTSV_CHARSET=utf8
OPC_MGMT_SERVER=omserver (changed to be ambiguous)
OPC_SUPPRESS_OUTAGE_BEFORE_MSI=TRUE
OPC_SVCM_ADD_WARN_IF_EXISTS=TRUE
OPC_SVCM_ERROR_CHECKING=FULL

I will wait for the results.

 

In the meantime, I have attached the policy settings (policy_settings.png) and the matched message (alert_matched.png).

Commodore

Hi Vladislav,

 

Just setting the variable to FALSE did not work.

 

I will try performing a management server service restart.

 

Update:

 

I performed a full restart (opcsv -stop, opcagt -kill, then opcsv -start, opcagt -cleanstart; all services are running). The message still exists, and the automatic action is still executed for each duplicate.

 

Does it make a difference that this interceptor policy runs on the management server and executes the shell script from the management server? The variable description does mention 'remote' automatic actions, and this policy is an override for OpC internal messages.

 

Is there anything else I can check?

Commodore

Hi Tobias,

 

Thanks for the awesome tip 🙂

 

How did you manage to find out this trick? I haven't seen this documented anywhere.

 

I was vaguely thinking about MSI (except it's not used in this environment).

 

Even then there is not much detail covered in documents surrounding the use of MSI.

 

I would have not picked up on using this exact method/setting though.

 

I'm not directly using MSI (Message Stream Interface) in this environment yet this trick appears to work.

 

4 duplicate alerts have been received and it has only executed one automatic action.

 

Does this mean it is diverted indefinitely for the duration of the active message?

 

It appears so: more than 15 minutes have already passed and there is still no duplicate automatic action.

 

It appears the suppression works as you mentioned.

 

Thanks.

 

Micro Focus Expert

Hello,

 

You're welcome. I've been working with this product for a long time and this question has come up before 🙂

 

There is no need to have MSI enabled on the server. This setting causes output to the MSI in divert mode if the MSI is enabled. If not, then nothing happens. So, this is completely safe to use.

 

The whole point is that this setting tells the agent that the message may be discarded on the server (e.g. because the message may be diverted to ECS), and thus the agent will not start the action immediately. What happens on the server is a completely different story.

 

Best regards,

Tobias

 

Commodore

Hi Tobias,

 

Thanks for providing further insight into how this works and that it is safe to use 🙂

 

Kudos has been given and Solution has been accepted.

 

Thanks.

 

Kindest Regards,

 

Nick.
