Data collection aborted on HP-UX
I use OMW 9 with Agent 11.05. On one HP-UX node running 11.05, data collection stopped suddenly and started again automatically after 7 hours. status.scope gave the message below.
== Fatal Nums Error == 11.05.005 02/28/13 ==
User: root Date: Sun Jun 15 14:34:00
File: /svn-share2/BLR.ovpa.11.00.r/hp1123ipf/hpsw-oa/PA/numsVob/hp/11.0/nums.C Line: 538 Product id: MWA
System: DCBKP1234 B.11.31 ia64
Errno: 0 (Error 0)
Connection to midaemon lost -- check midaemon process and status.mi
== End of Error Msg =============================
**** /opt/perf/bin/scopeux : 06/15/14 23:15:43 ****
scopeux/HP-UX 11.05.005 COLLECTOR BEGIN.
MI: Sun Jun 15 23:15:45 2014
WARNING: mi_lvm_read_vg_info_ext - mi_lvml_get_vginfo Failed for VG /dev/vgtmp
Please suggest why it stopped.
Memory pressure and an excessive virtual set size of the midaemon are causing this problem.
Please follow the KM to fix that.
Even though this is an old post, the issue I mentioned here keeps occurring again and again. The same error I reported from status.scope has now repeated on 3 HP-UX servers. Data collection stopped unexpectedly and started again only after a server reboot. We did not even know that scopeux had stopped, and we did not get any alert telling us that scope was aborted or stopped. The agent kept processing policies and was sending threshold alerts and other DBSPI-related alerts without any failure. As you mentioned, we can't give 725 MB of memory to the agent, since the server has 18 GB and many database-related processes are running on it. The UNIX team is already telling us that "opcmona" is among the top ten processes by memory usage.
The issue happened on 3 servers and we only came to know after 15 days!
Also, "status.mi" does not contain the error mentioned in the knowledge base document.
It looks strange that data collection stopped on the node and we did not get any information on the OMW server. It is not at all possible to check data collection on each and every server every day!
We have already raised a case with HP. Still, what can we do to get at least an alert telling us that data collection has stopped?
I have configured a process and service policy for scopeux, but the automatic command "/opt/OV/bin/opcagt -start" did not work. Scopeux was still stopped even after successful execution of the command!
If anybody has a better solution, please suggest it. This is a very critical issue which we are facing today!
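One possible way to get at least an alert is a scheduled check that reads status.scope and looks at whether a fatal error was logged after the last collector start. Below is a minimal Python sketch, assuming the file keeps the layout shown in the excerpt earlier in this thread (a "Fatal Nums Error" header on abort, a "COLLECTOR BEGIN" line on start). The function name `collector_down` and the path /var/opt/perf/status.scope are assumptions, not something confirmed in this thread.

```python
import sys

# Assumed location of the scope status file; adjust for your installation.
STATUS_FILE = "/var/opt/perf/status.scope"


def collector_down(log_text: str) -> bool:
    """Return True if the most recent relevant entry is a fatal error.

    Walks the log top to bottom: a "Fatal Nums Error" header marks the
    collector as down, a later "COLLECTOR BEGIN" line marks it as up again.
    """
    down = False
    for line in log_text.splitlines():
        if "Fatal Nums Error" in line:
            down = True
        elif "COLLECTOR BEGIN" in line:
            down = False
    return down


if __name__ == "__main__":
    with open(STATUS_FILE, errors="replace") as handle:
        text = handle.read()
    if collector_down(text):
        print("scopeux: fatal error logged with no later COLLECTOR BEGIN")
        sys.exit(1)  # non-zero exit so a monitor policy can raise an alert
    print("scopeux: no unrecovered fatal error in status.scope")
```

A non-zero exit code could be picked up by a scheduled-task or monitor policy and turned into a critical message on the OMW server. Note that this only detects aborts that are written to status.scope; it does not replace a direct check of the scopeux process.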
Dear Experts,
I need to monitor the scopeux process. If it stops, I need to get a critical alert, and an automatic action should run to start scopeux on the node.
For this I have created a service/process monitoring policy, and it did send an alert when scopeux was stopped.
But the automatic action I configured, "/opt/OV/bin/opcagt -start", failed to start the scopeux process.
Please suggest an automatic command which will start the process if it has stopped.
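As an illustration of the kind of automatic action being asked for, here is a rough Python sketch of a watchdog: it checks the `ps -e` output for scopeux, runs a restart command if the process is missing, and checks again. The restart command is deliberately left as a parameter, because this thread does not say which command actually restarts scopeux (only that "/opt/OV/bin/opcagt -start" did not). The names `is_running` and `ensure_running` are hypothetical, and whatever command you pass in should be verified against the agent documentation first.

```python
import subprocess
import sys
import time


def is_running(ps_output: str, name: str = "scopeux") -> bool:
    """Return True if a line of `ps -e` output ends with the given command name."""
    for line in ps_output.splitlines():
        fields = line.split()
        # The last column of `ps -e` is the command; strip any leading path.
        if fields and fields[-1].rsplit("/", 1)[-1] == name:
            return True
    return False


def ensure_running(restart_cmd, name: str = "scopeux", settle_seconds: int = 30) -> bool:
    """Restart the process if it is missing; return True if it is up afterwards."""
    ps = subprocess.run(["ps", "-e"], capture_output=True, text=True, check=True)
    if is_running(ps.stdout, name):
        return True
    # restart_cmd is supplied by the caller (assumption: it restarts the collector).
    subprocess.run(restart_cmd, check=False)
    time.sleep(settle_seconds)  # give the collector time to come up
    ps = subprocess.run(["ps", "-e"], capture_output=True, text=True, check=True)
    return is_running(ps.stdout, name)


if __name__ == "__main__":
    # Usage: watchdog.py <restart command and its arguments>
    if len(sys.argv) < 2:
        sys.exit("usage: watchdog.py <restart command ...>")
    sys.exit(0 if ensure_running(sys.argv[1:]) else 1)
```

The second `ps` check matters here: as described above, an action can report success while scopeux is still down, so the script returns a non-zero exit code when the process has not come back, which a policy can then escalate.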
Please refer to the following documents when planning how to start the midaemon:
In addition, I would recommend a newer agent version, 11.1x, for midaemon stability.