NNM Service Health Agent Errors

NNM Service Health Agent

Service                                                             State
Com.hp.ov.nms:service=NmsModel                                      Started
Com.hp.ov.nms.as.shared:service=NmsAuthService                      Not Found
Com.hp.ov.nms.as.shared:service=NmsCrlManager                       Not Found
Com.hp.ov.nms.as.shared:service=NmsTrustManager                     Started
Com.hp.ov.nms.geo.bridge:name=BridgeManager                         Started
Com.hp.ov.nms.geo.bridge:name=RequestCache                          Started
Com.hp.ov.nms.topo:service=DataBaseMaintenance                      Started
Com.hp.ov.nms.topo:service=KeyManager                               Started
Com.hp.ov.nms.topo:service=NetworkApplication                       Started
Com.hp.ov.nms.topo:service=customexport:mbean=CustomExportService   Started
Com.hp.ov.nnm.security:service=TrustManager                         Started


The two highlighted services are the ones I am having problems with. They are causing a Critical health error on the NNM cluster. This setup is integrated with OML and SiteScope.

  • I am running NNMi 9.23 in HA mode. By "highlighted" I mean the services that are NOT STARTED (the two shown as Not Found in the table above).

  • Hello,

     

    Does this problem happen right away after startup, or does NNMi run for a while without these errors? If it is a startup problem, the hints should be in boot.log. If NNMi can be stopped, I would do that, rename boot.log in the NNMi log directory, start NNMi, and then scroll down from the top and look for the first 2-3 errors/exceptions. That might give some ideas.

    Also, make sure there are no older renamed files in the JBoss deployment directories NNMInstallDir/OV/NNM/server/deploy and NNMInstallDir/OV/NNM/lib. All files in those directories will be deployed.
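
The two checks above could be sketched in shell roughly as follows. This is only an illustration: the function names, the list of error keywords, and the backup-style file suffixes are assumptions, not anything defined by NNMi, so adjust them to what you actually see on your system.

```shell
#!/bin/sh

# Print the first N lines (default 3) of a log that look like errors or
# exceptions, prefixed with their line numbers.
# The keyword list is an assumption, not an official NNMi list.
first_errors() {
    log_file=$1
    count=${2:-3}
    grep -n -E 'ERROR|FATAL|SEVERE|Exception' "$log_file" | head -n "$count"
}

# List files in a directory tree whose names suggest a renamed older copy.
# The suffixes below are hypothetical examples of how such copies get named.
list_renamed_copies() {
    dir=$1
    find "$dir" -type f \( -name '*.bak' -o -name '*.old' \
        -o -name '*.orig' -o -name '*~' \)
}
```

After restarting NNMi with a fresh boot.log, calling `first_errors` with the path to that boot.log shows where to start reading, and calling `list_renamed_copies` on each of the two deployment directories shows any leftover copies that should be moved out of the way.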


  • Hello drseymo,

     

    Are you running NNMi 9.23 on Windows 2008? I've seen a related issue in a previous case. If you're running Windows 2008, please check for a service named Application Experience, set that service to Manual, and then see if the problem happens again.

     

    Also, as Sergey suggested, please check the boot.log and nnm.log files.

     

    BR,

    Roberto

  • Colleagues,

    I am currently running Red Hat Enterprise Linux 5.10 (Tikanga). I compared /OV/NNM/lib on the server in question with the same directory on a known good server, and they matched. I restarted NNMi and saw no change: those 2 services are still not found. I will check the logs next.
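
For anyone repeating that comparison, one possible approach is to build a checksum manifest of the directory on each server and diff the two manifests. The sketch below assumes only standard POSIX tools; `dir_manifest` is a made-up helper name, not an NNMi utility.

```shell
#!/bin/sh

# Print one line per file under a directory: "<checksum> <size> <relative path>".
# Sorting in the C locale keeps the order identical across servers.
dir_manifest() {
    (
        cd "$1" || exit 1
        find . -type f | LC_ALL=C sort | while IFS= read -r f; do
            printf '%s %s\n' "$(cksum < "$f")" "$f"
        done
    )
}
```

Run `dir_manifest` against the lib directory on each server, redirect the output to a file, copy one file across, and compare the two with `diff`. An empty diff means the file names and contents match.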

  • Colleagues,

    I looked at /var/log/boot.log and there are no entries. Is there anywhere else you want me to look?

  • Hello,

    I do not think Sergey meant for you to look at /var/log/boot.log, but rather at /var/opt/OV/log/nnm/boot.log (the NNMi software has its own boot log).

    Since this is set up in a cluster, first put the affected node in maintenance mode.

    Then stop the NNMi services (ovstop -v).

    Then check whether any zombie NNMi processes are still running (use the ps command and grep its output for terms like "OV", "nnm", "perl", "postgres", and so on).

    After this, run tail -f on the boot log and start NNMi (ovstart -v), recording the session if possible. Then read the boot log from the time you executed ovstart. I would expect messages like, say, "Interface Repository Deprecated"; you can also search for words like "WARN*", "FATAL", "SEVERE", and so on.

     

    After all services have started, execute "ovstatus -v" and post the output, and attach the boot.log from the NNMi log directory (if you are OK with posting it here).
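
The stop, check, and restart sequence above might be scripted along these lines. The two helper functions are hypothetical names introduced for illustration; only ovstop, ovstart, ovstatus, and the boot.log path come from this thread. Putting the node into maintenance mode is left out because the exact procedure depends on the cluster product.

```shell
#!/bin/sh

# Filter a process listing read from stdin (for example the output of
# `ps -ef`) for the terms suggested above, dropping the grep process itself.
filter_leftovers() {
    grep -E 'OV|nnm|perl|postgres' | grep -v grep
}

# Print lines of a log, from a given line number onward, that contain
# WARN, FATAL, or SEVERE. Used to read only what was written after ovstart.
scan_since() {
    log_file=$1
    start_line=$2
    tail -n "+$start_line" "$log_file" | grep -E 'WARN|FATAL|SEVERE'
}

# Intended sequence (commented out so that sourcing this file runs nothing):
#   ovstop -v
#   ps -ef | filter_leftovers          # should print nothing NNMi-related
#   before=$(wc -l < /var/opt/OV/log/nnm/boot.log)
#   ovstart -v
#   scan_since /var/opt/OV/log/nnm/boot.log $((before + 1))
#   ovstatus -v
```

Recording the line count before ovstart is a simple substitute for watching tail -f live: it limits the search to messages written during this startup.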