Outstanding Contributor.

ESM Health Monitoring

Hello,

I would like to share the ESM health monitoring package we have developed here in the CDC. This combination of content and scripts enables you to continuously monitor critical ESM health statistics such as IO, ESM heap usage, full JVM garbage collections, CPU, EPS, connector caching, and whether the maximum number of allowed threads is exceeded. We also monitor how "real time" our environment is, i.e. how long it takes an event to reach the manager after it was logged on the device, or the difference between ET and MRT. In addition, we collect all errors and exceptions from server.log, and display and trend them for easier access. We can even figure out which line of ESM code is the worst.
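To make the ET/MRT comparison concrete, here is a minimal Python sketch of the drift calculation. The function and variable names are hypothetical and not taken from the package; the sketch only assumes that both timestamps are available as timezone-aware datetimes.

```python
from datetime import datetime, timezone


def drift_seconds(event_time: datetime, manager_receipt_time: datetime) -> float:
    """Return MRT minus ET in seconds; a positive value means the event arrived late."""
    return (manager_receipt_time - event_time).total_seconds()


# Hypothetical example values: the event reaches the manager 42 seconds
# after it was logged on the device.
et = datetime(2015, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
mrt = datetime(2015, 6, 1, 12, 0, 42, tzinfo=timezone.utc)
print(drift_seconds(et, mrt))
```

A large or growing drift would suggest that events are being delayed somewhere between the device and the manager, for example by connector caching.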

This package was developed on ESM 6.5, but should work on other versions as well.

After installing the attached package, download the files under /All Files/HP/CDC/ENG/Health. All files except for arst_stats should go under /opt/software/scripts

arst_stats goes under /etc/cron.d/

If the device that stores the events is not sda, change the device name in stats.sh.
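The actual stats.sh is not shown in this thread and may well use iostat or sar instead. As a rough illustration of why the device name matters, the following Python sketch picks the IO counters for one device out of text in the Linux /proc/diskstats format. The function name and the returned keys are hypothetical.

```python
def parse_diskstats(text: str, device: str = "sda") -> dict:
    """Return selected IO counters for one device from /proc/diskstats-style text."""
    for line in text.splitlines():
        fields = line.split()
        # Columns: major, minor, name, then the per-device counters.
        if len(fields) >= 14 and fields[2] == device:
            return {
                "reads_completed": int(fields[3]),
                "sectors_read": int(fields[5]),
                "writes_completed": int(fields[7]),
                "sectors_written": int(fields[9]),
                "ms_doing_io": int(fields[12]),
            }
    raise ValueError(f"device {device!r} not found")


# Made-up sample line in the /proc/diskstats layout.
sample = "   8       0 sda 1000 10 20000 300 2000 20 40000 600 0 500 900\n"
print(parse_diskstats(sample, "sda"))
```

If the events live on another device, passing a different name (or editing the equivalent setting in stats.sh) is the only change needed; an unknown device raises an error rather than silently reporting nothing.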

Create an empty file, /opt/software/scripts/output/stats.log, and set up a CEF file connector to read from it. Set the connector to start reading from the end of the file (agents[0].startatend=true in agent.properties); otherwise you will get all events again every time you restart the connector.
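For readers who want to feed their own statistics into the same file, here is a minimal Python sketch of building a CEF line and appending it to a log. The vendor, product, signature ID and extension keys below are assumptions for illustration, not the values the package's scripts actually emit; only the general CEF layout (seven pipe-separated header fields followed by key=value extensions) is standard.

```python
def _escape_header(value: str) -> str:
    # In CEF header fields, backslash and pipe must be escaped.
    return value.replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value: str) -> str:
    # In CEF extension values, backslash and equals sign must be escaped.
    return value.replace("\\", "\\\\").replace("=", "\\=")


def format_cef(name, signature_id, severity, extensions,
               vendor="Example", product="ESM Health", version="1.0"):
    """Build one CEF:0 line; vendor/product/version defaults are placeholders."""
    header = "|".join(
        _escape_header(str(field))
        for field in ("CEF:0", vendor, product, version, signature_id, name, severity)
    )
    ext = " ".join(f"{key}={_escape_extension(str(value))}"
                   for key, value in extensions.items())
    return f"{header}|{ext}"


def append_event(path, line):
    """Append one event line to the file the CEF file connector is reading."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")


# Hypothetical EPS event; shost is the CEF key for the source host name.
line = format_cef("EPS", "health:eps", 3,
                  {"shost": "esm01", "cn1": 1250, "cn1Label": "eps"})
print(line)
```

Calling append_event("/opt/software/scripts/output/stats.log", line) would then add the event to the file from the post. Because the connector starts at the end of the file, only lines appended after it starts are picked up.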

You should be good to go.

You will get IO, memory, GC, CPU stats every hour at 00:30, 01:30, etc. and EPS every 10 minutes.

Note that on ESM 6.5, due to an annoying bug, QV variables do not survive ESM restarts, and every time you restart, the EPS component of the dashboard will show up broken. To work around that:

  1. After you install the content for the first time, create and export a package with the QV /All Query Viewers/HP/GCS/ENG/Health/Performance/EPS
  2. After a manager restart, close the ESM Performance Stats dashboard and delete the above QV (accept the warning prompts).
  3. Install the package you backed up in (1). Note that if it’s already on the system you will have to delete it first (you can select leave resources).
  4. Open the ESM Performance Stats dashboard, and you should be okay.

Please file a bug to get it fixed.

Doron

/All Dashboards/HP/GCS/ENG/Health/Performance/ESM Performance Stats:

(In the EPS chart, red symbolizes events cached by the connectors).

health.JPG

/All Dashboards/HP/GCS/ENG/Health/Event Real Time/MRT-ET Drift Overview:

Real Time_.jpg


/All Dashboards/HP/GCS/ENG/Health/Errors and Exceptions/Errors and Exceptions:

Errors and Exceptions.jpg

Message was edited by: Doron Keller: Replaced arb as the version was outdated

43 Replies
Acclaimed Contributor.

Re: ESM Health Monitoring

Hello! You did a great job!

How does this package work with a remote ESM? (I have 2 servers, active and passive.)

I cannot find the file "arst_stats"...

Outstanding Contributor.

Re: ESM Health Monitoring

Evgeny,

I had uploaded an older arb version; this is fixed now. Thank you for pointing it out.

The scripts have to be installed on the server you want to monitor. If you want, you can point the connector to a central ESM. You will have to somewhat modify the content if you do it. Events with performance stats should have the machine on which they were generated in sourceHostname.

Respected Contributor.

Re: ESM Health Monitoring

Hi, the package seems very interesting.

However, on ESM 5.2 I was not able to install it (Import Failed: Invalid archive:Element type "caseSensitiveType" must be declared.).

Maybe the package will only run on the 6.X branch.

Has anyone been able to install it on the 5.X branch?

Thanks!

Super Contributor.

Re: ESM Health Monitoring

I have the same issue on a 5.x version.

Super Contributor.

Re: ESM Health Monitoring

Hi all,

Here is an unofficial way to import ESM 6 packages into 5.x.

An arb file is simply XML in a zip archive. You can extract this XML, find all lines containing "caseSensitive" (for example, the caseSensitiveType element shown below), and remove them.

After this you can archive it again, rename it to *.arb, and import it into the 5.x version.

Example:

<ActiveList id="H230WXEYBABD-KemJPdH7Iw==" name="AV_Current_Infected Host Count" versionID="AAAAAFtEasnjeIVA" contentVersionID="AAAAAFtEd-3jeIVB" action="insert" >

      <capacity>10000</capacity>

      <caseSensitiveType>0</caseSensitiveType>

But you may run into other issues (such as lightweight and pre-persistence rules).
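The manual steps above can be sketched in Python as follows. This is an unofficial, untested-against-ESM illustration: it assumes the XML inside the archive is UTF-8, that XML entries have names ending in .xml, and that each element to remove sits on a single line as in the example. The function names are hypothetical.

```python
import re
import zipfile


def strip_element(xml_text: str, tag: str = "caseSensitiveType") -> str:
    """Remove every line that holds a single <tag>...</tag> element."""
    pattern = re.compile(
        rf"^[ \t]*<{re.escape(tag)}>[^<]*</{re.escape(tag)}>[ \t]*\r?\n?",
        re.MULTILINE,
    )
    return pattern.sub("", xml_text)


def rewrite_arb(src_path, dst_path, tag: str = "caseSensitiveType") -> None:
    """Copy a zip-based .arb, stripping the given element from its XML entries."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.lower().endswith(".xml"):
                # Assumption: the package XML is UTF-8 encoded.
                data = strip_element(data.decode("utf-8"), tag).encode("utf-8")
            dst.writestr(info.filename, data)


sample = (
    "<capacity>10000</capacity>\n"
    "      <caseSensitiveType>0</caseSensitiveType>\n"
    "</ActiveList>\n"
)
print(strip_element(sample))
```

A call such as rewrite_arb("package.arb", "package_5x.arb") (file names hypothetical) would write a modified copy and leave the original untouched. As noted above, removing this one element does not guarantee that the rest of the content imports cleanly on 5.x.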

Respected Contributor.

Re: ESM Health Monitoring

I think the package will work on ESM 5.5. I will give it a try in a few days. There is a missing function in ESM 5.2.

Respected Contributor.

Re: ESM Health Monitoring

Hello,

While importing this package in 5.5 I am getting the error below:

Import Failed: Invalid archive:Element type "SendAuditEvent" must be declared.

Regards,

Super Contributor.

Re: ESM Health Monitoring

I'm getting this same issue in 6.1. Can someone please help identify what issue we might be facing?

Outstanding Contributor.

Re: ESM Health Monitoring

You are right. I don't think the content can be installed on 5.x because of changes to resources that happened since, but I think you can still use the scripts that extract the stats.

Respected Contributor.

Re: ESM Health Monitoring

Hi Doron,


We've just upgraded to ESM 5.5 and now the use case is working. You really did a great job! One suggestion: the stats.sh script could also collect the sar stats for Ethernet adapters.


Really nice work !

Outstanding Contributor.

Re: ESM Health Monitoring

Thank you.

Yes, we also had ideas for other stats to add. The overall free memory on the box is a big one for us, since we've seen related EPS issues. The problem is that real estate on the dashboard is running out.

Doron

The opinions expressed above are the personal opinions of the authors, not of Micro Focus. By using this site, you accept the Terms of Use and Rules of Participation. Certain versions of content ("Material") accessible here may contain branding from Hewlett-Packard Company (now HP Inc.) and Hewlett Packard Enterprise Company. As of September 1, 2017, the Material is now offered by Micro Focus, a separately owned and operated company. Any reference to the HP and Hewlett Packard Enterprise/HPE marks is historical in nature, and the HP and Hewlett Packard Enterprise/HPE marks are the property of their respective owners.