Lieutenant

Cluster-aware monitoring

Hello,

We are experiencing the following issue. In our OML9 we have a virtual node defined by a cluster group and 2 physical servers. Based on our experience we expect that the deployed policies will be automatically disabled on the inactive (OFFLINE) node. Although this works on other cluster environments, we have one case where the setup does not work properly. The cluster software used in all cases is VCS (Veritas Cluster Server). The OS is RHEL Linux.

 

Thanks

Jan

Absent Member..

Hello,

 

Could you please provide more details as to what exactly is not working properly?

 

Best regards,

HP Support
If you find that this or any post resolves your issue, please be sure to mark it as an accepted solution.
If you liked it I would appreciate KUDOs.
Micro Focus Expert

Hello Jan,

 

In order for the Cluster Awareness (Claw) to disable and enable the policies, the policies need to be assigned to the virtual node.

 

Make sure the policies are assigned only to the virtual node. If a policy is assigned to both the physical node and the virtual node, it will stay enabled.

 

You can check on the agent whether a policy was assigned to a HARG using ovpolicy:

# ovpolicy -list -level 4

...


  monitor           "distrib_mon"               enabled    0009.0000
    policy id      : "6ac5c3bc-e455-11dc-808e-00306ef38b73"
    owner          : "OVO:tobias"
    category (1): "examples"
    attribute (1) : "product_id" "ovoagt"
    attribute (2) : "checksum_header" "73216b61c54e950fc852c7d8292de7c408b8cadf"
    attribute (3) : "version_info" ""
    attribute (4) : "version_id" "6ab89412-e455-11dc-9d57-00306ef38b73"
    attribute (5) : "HARG:ov-server" "no_value"

 

Policies that were assigned to a virtual node have an attribute with the name "HARG:<HARG-name>".

In my case the HARG name is ov-server.
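
A quick way to filter for this attribute when many policies are installed is to pipe the same output through grep (just a sketch, assuming a standard shell on the node):

# ovpolicy -list -level 4 | grep -i "HARG:"

If that returns nothing, none of the installed policies carry a HARG attribute.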

 

 

Next, please check that the cluster was detected on the managed node:

# ovclusterinfo -a

 

If it returns an error, the cluster version may not be supported or recognized by the agent version in use.
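
Independently of the agent, you can also confirm that VCS itself is up and healthy on that node (a sketch, assuming the VCS binaries are in the default location):

# /opt/VRTSvcs/bin/hastatus -sum

That shows the cluster systems and service groups as VCS sees them, which you can compare against what the agent reports.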

 

What agent version are you using?

# opcagt -version

 

Best regards,

Tobias

 

Lieutenant

Hello Tobias,

I did the checks as suggested and here is the output.

 

ovpolicy -list -level 4 (only an excerpt of the output is shown below)

 

      * List installed policies for host 'localhost'.

   Type              Name                        Status     Version
 --------------------------------------------------------------------
  configfile        "GBL_Linux_OA_ParmPolicy"   enabled    1100.0001
    policy id      : "1ea23bbc-5a40-71e2-0559-17fc19cb0000"
    owner          : "OVO:DHL_PRG_OML"
    category (1): "HPOpsAgt"
    attribute (1) : "product_id" ""
    attribute (2) : "checksum_header" "12896ac3017c7a55daed46300f1abc175ce278ed"
    attribute (3) : "creation_date" "1316647563"
    attribute (4) : "creation_user" "MSSPINT12\Administrator (MSSPINT12)"
    attribute (5) : "version_info" ""
    attribute (6) : "version_id" "1ea63c6c-5a40-71e2-0559-17fc19cb0000"

  configsettings    "OVO settings"              enabled    1
    policy id      : "a1b6413e-f15e-11d6-83d0-001083fdff5e"
    owner          : "OVO:DHL_PRG_OML"
    category     : <no categories defined>
    attribute     : <no attributes defined>

 

ovclusterinfo -a
ERROR:    (conf-599) Cluster exception.
          (conf-236) Can not get the state of the local node.

opcagt -version
11.02.011

I did not find the HARG attribute in the output of the first command, and the ovclusterinfo -a command also resulted in an error.

 

Thanks

Jan

 

Micro Focus Expert

Hello Jan,

 

Yes, the HARG attribute is not there. That means that the policy was either assigned only to the physical node(s) or it was assigned to both the physical and the virtual node. To check what a policy is assigned to, you can go to the policy bank or All policies in the AdminUI and select "Direct node(group) assignments" in the View menu.

 

The agent version on the managed node is pretty old. Possibly that agent version doesn't support the cluster version. To check which HA versions are supported, you can check SUMA (the support matrix):

http://support.openview.hp.com/selfsolve/document/KM323488

 

Select HP Operations Agent as the product and then go to the High-Availability section.

 

Best regards,

Tobias

 

 

 

Lieutenant

Hello,

The policies were directly assigned only to the virtual node. No policies were assigned directly to the physical ones.

 

The interesting thing is that we are using the same setup with the same agent version elsewhere, and there the cluster-aware setup works perfectly (although the command ovclusterinfo -a fails with the same error).

 

The only difference I could spot was that the problematic cluster runs on RHEL 6.1 while the one where everything works fine is RHEL 6.3. But I can't believe that this could be the root cause...

 

Update (10:30am)

We have recently experienced an issue on the "working cluster" as well. When one of the nodes failed and the cluster failed over, the policies were automatically enabled on the newly active node, but the policies on the inactive node were not DISABLED, resulting in false alerts. Could this somehow be related to the fact that the command ovclusterinfo -a does not work?

 

Regards

Jan

Micro Focus Expert

Hello Jan,

 

I think there are two independent issues:

1. Cluster awareness (Claw) doesn't work

This is indicated by "ovclusterinfo -a" failing. If ovclusterinfo -a doesn't work, then enabling and disabling policies probably won't work either. I'm surprised to hear that it does.

 

The reason it is failing is probably that the cluster version is not supported by your agent version.
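
If you want to dig a little deeper, the agent usually writes cluster-detection problems to its system log. A quick check (just a sketch; the path assumes a default Linux agent installation):

# grep -i cluster /var/opt/OV/log/System.txt | tail -20

Messages similar to the conf-599 / conf-236 errors from ovclusterinfo may show up there as well.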

 

2. No HARG in ovpolicy output

Regardless of whether Claw works or not, the HARG attribute should be present.

Perhaps you did not specify a HARG (cluster package) for the virtual node in the node bank.

 

Best regards,

Tobias

 

Lieutenant

Hello,

 

ad 1. Is there something I can do (change config) to make the command work?

Regarding your point that the cluster is not supported by our agent: is there an agent version that supports this configuration?

 

[root@czstlls069 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.1 (Santiago)

[root@czstlls069 ~]# /opt/VRTSvcs/bin/had -version
Engine Version    6.0
Join Version      6.0.30.0
Build Date        Fri 11 Jan 2013 01:00:01 AM CET
PSTAMP            6.0.300.000-GA-2013-01-10-16.00.01

According to the SUMA I have just downloaded, there is currently no agent supporting this combination of OS and VCS. Can you confirm?

 

ad 2. We have a lot of cluster environments where the monitoring works fine. The problem exists only here; we have double-checked the OML setup and everything seems fine.

 

Thanks

Jan

Micro Focus Expert

Hello Jan,

 

ad 1. Yes, that's correct. There is currently no agent version that supports that combination.

 

I found an existing Enhancement Request for VCS 6.0.1 support:

QCCR1A169008 Operations Agent support needed on Veritas Cluster 6.0.1 on RHEL 6.4

http://support.openview.hp.com/selfsolve/document/LID/QCCR1A169008

 

You can register for that ER to be notified when it is released.

 

ad 2. Can you please run this command to check whether the cluster package is defined correctly?

# /opt/OV/bin/OpC/utils/opcnode -list_virtual node_name=<virtual-node-name>

 

Best regards,

Tobias

 

Lieutenant

# /opt/OV/bin/OpC/utils/opcnode -list_virtual node_name=prgnbu2013
Attributes of virtual node 'prgnbu2013.gcc.dhl.com'
==========
cluster_package=nbumas2013
node_list="prgprod83.dhl.com prgproddr69.dhl.com"
Operation successfully completed.

# /opt/OV/bin/OpC/utils/opcnode -list_virtual node_name=prgnbunfe
Attributes of virtual node 'prgnbunfe.dhl.com'
==========
cluster_package=nfenbumas
node_list="prgprod101.dhl.com prgproddr101.dhl.com"
Operation successfully completed.

 Jan

Micro Focus Expert

Hello Jan,

 

That looks all good. I would try these things:

1. Distribute to the virtual node with a forced update (-force)

 

Verify with ovpolicy -list -level 4

 

2. De-assign / re-assign the policy to the virtual node and distribute again

 

Verify with ovpolicy -list -level 4

 

3. If the previous steps fail, you could try this to re-create the resolved assignments:

/opt/OV/bin/OpC/opcdbidx -config

 

That will re-create the resolved node-to-policy assignments (opc_node_config).

That means the next distribution will re-distribute all the policies, even without -force.
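
One more thing worth cross-checking (just a sketch, assuming the default VCS paths): as far as I know, the cluster_package value of the virtual node has to match the VCS service group name exactly. On the node you can list the service groups and their states with:

# /opt/VRTSvcs/bin/hagrp -state

and compare the group names against the cluster_package values from your opcnode output (nbumas2013 and nfenbumas).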

 

Best regards,

Tobias

 

Lieutenant

We have tried steps 1 and 2 with no luck; the situation stays the same.

 

Step 3 is something I don't dare to do on a server that monitors our whole production environment.

 

I will raise an enhancement request for this platform combination.

 

Thanks

Jan
