Thom_Fitzpatric

Absent Member.
2007-06-20
11:14
How to set OVO heartbeat polling timeout value?
I keep getting the message "Failed to contact node xyz with BBC. Probably the node is down or there's a network problem."
I get these for various nodes at random; usually they're Solaris zones throwing the problem. I've validated that the node is indeed up, and every single time I get the message and bbc ping the node, it's OK.
What I'd like to do is extend the timeout value used when polling the node, to increase the chance of it actually successfully connecting.
How?
All your base are belong to us
3 Replies
Daav

Absent Member.
2007-06-21
02:50
Hi Thom,
Check the OVOW online help for information on this topic and search for nodeinfo. This is how we do it because we don't allow ICMP pings:
Server settings (registry MsgActSrv):
DISABLE_ACTIVE_PING_HEALTH_CHECK "TRUE"
HEALTH_CHECK_INTERVAL "290"
HEALTH_CHECK_MSG_SEVERITY "Critical"
Agent settings (opcinfo or nodeinfo):
OPC_COMM_TYPE RPC_DCE_TCP
OPC_RPC_ONLY TRUE
OPC_DO_HBP_ON_AGENT FALSE
OPC_HBP_INTERVAL_ON_AGENT 140
We thus use RPC over TCP only for health checking. The agents send an alive packet every 140 seconds, and the server checks every 290 seconds whether a packet has been received from each node. 290 / 140 = 2.x, meaning the agent has 2 chances to notify the management server that it's still alive.
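The interval arithmetic above can be sketched in a few lines of Python, which may help when trying other values. This is only an illustration of the 290 / 140 reasoning; the function names are made up for this example and are not part of any OVO tool.

```python
def heartbeats_per_check(health_check_interval_s: int, agent_interval_s: int) -> int:
    """Return how many complete agent alive packets fit in one server check window."""
    if health_check_interval_s <= 0 or agent_interval_s <= 0:
        raise ValueError("intervals must be positive")
    return health_check_interval_s // agent_interval_s


def slack_seconds(health_check_interval_s: int, agent_interval_s: int) -> int:
    """Return the seconds left in the check window after the last complete heartbeat."""
    if health_check_interval_s <= 0 or agent_interval_s <= 0:
        raise ValueError("intervals must be positive")
    return health_check_interval_s % agent_interval_s


if __name__ == "__main__":
    # Values taken from the settings in this post:
    # HEALTH_CHECK_INTERVAL = 290, OPC_HBP_INTERVAL_ON_AGENT = 140
    print(heartbeats_per_check(290, 140))  # chances per check window
    print(slack_seconds(290, 140))         # spare seconds in the window
```

With the values from this post, the agent gets two chances per check window with 10 seconds to spare. Raising the server interval or lowering the agent interval increases the number of chances.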
Regards,
David
Daav

Absent Member.
2007-06-21
03:23
Hi Thom,
I see now that you're experiencing BBC problems and are therefore using OVOU, so my answer above will not really help.
However, I assume some of the settings are still valid, such as the polling and alive-packet parameters, so you should be able to find something about this in the OVOU documentation.
Cheers,
David
Esh

Cadet 1st Class
2012-12-11
11:12
Hi Dev,
Could you please help me identify those parameters (HEALTH_CHECK_INTERVAL, etc.) and where I need to check them? It's urgent. I will appreciate the comments with points.
Mgmt Srv: OMU 9.0 on HP-UX
Managed Env: Windows
Regards,
Easwar