
Experts: Filesystem Threshold Overrides

Hi Experts,

 

We have a set of thresholds defined in our filesystem monitoring policy, but very often a support team will want to override these for their server.

 

The same applies to the MessageGroup parameter.  The default message group might be "UNIX-TEAM", but for some filesystems on a specific server we use the override file facility to direct a filesystem's alerts to an application team, for example.

 

The problem is that these files/settings are not hierarchical in any way.

 

What would be great is if these were inherited/combined at the node level.

 

For example:

 

Default - /usr critical threshold 95%

 

Team X wants /bob set to 99%

 

We then have to copy the master details from the policy (/usr=95%) and create an override file that has:

 

/usr=95, /bob=99 for example

 

If it inherited from the master policy then we should only need to do:

/bob=99
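
The inheritance asked for here can be sketched in a few lines of Python. This is only an illustration of the desired behaviour, not something the product does: the dictionaries and function names are hypothetical, and the values are the ones from the example above.

```python
# Hypothetical data: policy-wide defaults and one node's overrides,
# both mapping a filesystem to its critical threshold in percent.
MASTER_THRESHOLDS = {"/usr": 95}
NODE_OVERRIDES = {"/bob": 99}


def effective_thresholds(master, overrides):
    """Return the master thresholds with node-level overrides layered on top."""
    merged = dict(master)      # start from the inherited defaults
    merged.update(overrides)   # node-specific entries win on conflict
    return merged


def format_override_line(thresholds):
    """Render thresholds in the "/usr=95, /bob=99" style used in this post."""
    return ", ".join(f"{fs}={pct}" for fs, pct in thresholds.items())


print(format_override_line(effective_thresholds(MASTER_THRESHOLDS, NODE_OVERRIDES)))
```

With inheritance like this, the override file for the node would only need to carry /bob=99, and a later change to the master /usr value would flow through without touching the override.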

 

It would make things so much easier.

 

Any thoughts on this?  Or how we can achieve this in a better way?

 

Regards

David Gerrish

protocolsoftware.com - we monitor IT

 

@openview
Micro Focus Expert

David,

 

A threshold in a threshold monitor policy can be overridden on a node by setting a local config variable.  For this to work, the config variable [eaagt]:OPC_OPCMON_OVERRIDE_THRESHOLD must first be set to TRUE.  For example:

 

# ovconfchg -ns eaagt -set OPC_OPCMON_OVERRIDE_THRESHOLD TRUE

 

Then one would set variables in the eaagt.thresholds namespace of the form:

 

 

<name>=<policy name>/<condition description>/<threshold value>:<reset value>

 

where:

name                     Variable name; it must follow the syntax of XPL config variables and must be unique on the system.

policy name              Name of the policy you want to customize.

condition description    Condition you want to customize.

threshold value          Threshold value you want to use on this system.

reset value              Reset value (you must define the reset value even if it is the same as the threshold value).

 

For example:

DiskUsage_1=DiskUsage/Critical Threshold/5:10
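
As a rough sketch, the value can be assembled programmatically when many overrides are needed. The Python below only builds the strings and runs nothing. The helper names are hypothetical, and passing `-ns eaagt.thresholds -set <name> <value>` to ovconfchg is an assumption modelled on the command shown earlier, so check it against the guide before use.

```python
import shlex


def threshold_value(policy, condition, threshold, reset):
    """Build '<policy name>/<condition description>/<threshold value>:<reset value>'."""
    return f"{policy}/{condition}/{threshold}:{reset}"


def ovconfchg_args(name, policy, condition, threshold, reset):
    """Return the argument list for setting one override (nothing is executed).

    Assumption: the variable is set in the eaagt.thresholds namespace with
    the same -ns/-set form as the OPC_OPCMON_OVERRIDE_THRESHOLD example.
    """
    value = threshold_value(policy, condition, threshold, reset)
    return ["ovconfchg", "-ns", "eaagt.thresholds", "-set", name, value]


args = ovconfchg_args("DiskUsage_1", "DiskUsage", "Critical Threshold", 5, 10)
print(shlex.join(args))  # shell-quoted form, for review before running by hand
```

Keeping the command as an argument list avoids quoting problems with the space in the condition description if it is later passed to `subprocess.run`.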

 

This is described in the "HTTPS Agent Concepts and Configuration Guide"

 

 

These values could even be implemented in a NodeInfo Policy to allow assignment and deployment from the management server.

 

Regards

Account_Closed
Not applicable

Deploying multiple nodeinfo policies to a node suffers from a limitation: when the agent is started, the order of precedence of the XPL config settings (especially where overrides have been placed) is non-deterministic.

So, in short, using multiple nodeinfo policies with overriding thresholds will not work.

However, one could use multiple configfile policies, each holding a different set of thresholds, plus a separate configfile policy that defines the priority/precedence to apply where they conflict. On deployment a script runs (a standard feature of configfile policies); this script reads all the config files, applies the precedence, and runs the appropriate XPL config settings. The resulting settings could also be saved to a file for later review.
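
A minimal sketch of such a script, in Python, might look like the following. Everything about the file layout is an assumption made for illustration: threshold files contain `filesystem=percent` lines, and a hypothetical `precedence.txt` lists the threshold files from lowest to highest priority. The real formats are whatever you choose to deploy.

```python
from pathlib import Path
import tempfile


def parse_thresholds(text):
    """Parse 'filesystem=percent' lines; blank lines and '#' comments are skipped."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fs, _, pct = line.partition("=")
        result[fs.strip()] = int(pct)
    return result


def resolve(directory, precedence_file="precedence.txt"):
    """Merge threshold files in the order listed in the precedence file.

    Files listed later take precedence over files listed earlier.
    """
    directory = Path(directory)
    order = (directory / precedence_file).read_text().split()
    merged = {}
    for name in order:
        merged.update(parse_thresholds((directory / name).read_text()))
    return merged


# Demonstration with throwaway files standing in for deployed configfile policies.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "unix_team.cfg").write_text("/usr=95\n/var=90\n")
    (tmp / "team_x.cfg").write_text("# Team X override\n/bob=99\n/var=97\n")
    (tmp / "precedence.txt").write_text("unix_team.cfg\nteam_x.cfg\n")
    resolved = resolve(tmp)

print(resolved)
```

The resolved mapping is what the script would then turn into XPL config settings, and writing it to a file first gives the review copy mentioned above.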

 

Let me know if I am (not) making sense.

Of course, all of this can be avoided if we use OMi and Monitoring Automation, as parameterization is built in.

 

- RamD

Thanks for the replies.

 

RRS - that's good to know you can do that, but that's not related to what I was asking.

 

Ram - you got it - and I understand HPOM can't deal with multiple config variables like this, but that's the problem, the real world needs a facility like this, and HPOM can't provide it very well.  I will look into what you said as a possible option, but I am not sure that will work.  Having an override facility is good, but it's not been addressed fully.

 

This is just one example of how the product developers "miss the point" of how real companies work, and they need to get closer to real people.

 

There are many more issues -  outage handling, scheduled maintenance, out of the box monitoring, dropping the visual part of the motif GUI, the java GUI, dropping NNM from HPOM etc.

 

All of these things come down to HP product development not listening to what real customers need from a monitoring tool.  We have to work around the limitations of the tool to get what is required to monitor a company's IT infrastructure.

 

If HP took the time to listen to what real customers needed with dedicated forums, workshops, meetings at Universe, then we really could make this product a world leader, as it used to be.  That is what I want.

 

Obviously we would have to address pricing, as that is forcing many companies I know to ditch HP software, but that's something that HP are not addressing and continue to ignore in my opinion.

 

Thanks guys, as always,

Regards

Dave

protocolsoftware.com

@openview

Ram - not sure how a configfile policy would work here.  What would go in it for the examples mentioned?

 

I think when I accidentally used a configfile policy rather than a nodeinfo policy for this before it "expanded" the wildcarding - for example /usr/* got expanded to whatever was on the node, which was not what was wanted, and would not be future-proof for FS additions.

 

Regards

Dave

 

protocolsoftware.com

@openview
Account_Closed
Not applicable

The config file is just that: it only deploys as a file that one can read on the node, so the format is left to you.

What I am suggesting is a script that reads the one or more config files, applies the hierarchical overriding mechanism to resolve conflicts, and runs the appropriate ovconfchg calls to set the finalized thresholds.

Let's discuss next week if you need more clarity.

That said - this is what OMi-MA would do. So do note HP already has a solution, based on the requirements. 🙂

HTH
- ramd

Ok that doesn't sound easy though, it sounds complicated!

 

Having OMi do it better doesn't help though if you don't have OMi !

 

HP often says "it's far better in tool X" - I have seen it said in the past, OM9 is much faster, and does everything much better, but that was always pure sales speak.

 

Appreciate your feedback though, it's just I can never tell a client "it works better in another product" when they have spent hundreds of thousands of pounds on this one, and it doesn't always do the job.

 

Thanks guys, good discussion as always.

Cheers

Dave

@openview

One for RRS,

 

For the ovconfchg overrides, you mention this as per the manual:

 

<name>=<policy name>/<condition description>/<threshold value>:<reset value>

 

I have a process monitor for which I would like to suppress one of the conditions.  For example, I monitor 10 Solaris processes, but on a few servers one of the processes does not exist.

 

Can I use this type of override to prevent the process from firing as down?

 

I have tried it, but the process is /usr/lib/picl/picld, so the condition contains / characters, which do not look like they would be handled by default.
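
To show why the slashes look like a problem, here is a small Python sketch. It is not the agent's actual parser, which is not described in this thread. It only splits the value on every "/" the way a simple implementation might, and the policy name "ProcMon" and the 0:0 values are made up for illustration.

```python
def naive_fields(value):
    """Split an override value on every '/' (NOT the agent's actual parser)."""
    return value.split("/")


# The documented example splits cleanly into policy, condition, threshold:reset.
plain = naive_fields("DiskUsage/Critical Threshold/5:10")

# Hypothetical policy name and values, with the process path as the condition.
with_path = naive_fields("ProcMon//usr/lib/picl/picld/0:0")

print(len(plain), plain)
print(len(with_path), with_path)
```

The first value gives three fields, while the second gives seven, so a parser working this way could not tell where the condition description ends.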

 

Regards

Dave

@openview