Absent Member

CPU bottleneck - I cannot see events

Hello experts 😄

I've tried to deploy Infra-SPI to monitor CPU bottlenecks, without success. I cannot see any events in the console about CPU usage or bottlenecks, even after setting a low threshold to force events.

Management Server: OMW 9

Managed Node:
HP-UX isxh2023 B.11.31 U ia64 3002978126 unlimited-user license
OA version: 11.14.014

Policies deployed (just about CPU usage)

MONITOR "SI-CPUBottleneckDiagnosis" enabled 1114.0000
MONITOR "SI-CPUBottleneckDiagnosis-test" enabled 0001.0008
MONITOR "SI-CPUBottleneckDiagnosis_Dazel" enabled 0001.0011

OPCMSG "InfraSPI-Messages" enabled 1111.0003
OPCMSG "OA-PerfCollComp-opcmsg" enabled 1100.0000
OPCMSG "opcmsg" enabled 0011.0012

Debug on:

Trace message from SI-CPUBottleneckDiagnosis_Dazel:
Current Values (Trace):

Total CPU utilization in percentage = 17(Threshold = 1)
CPU Interrupts Rate/second = 865
Context Switches Rate/second = 792

Vice Admiral

Does your User Profile (or User) have the required Responsibilities (Node Groups + Message Groups) to be able to see the alarms?

Absent Member

Yes, it does. It is an Admin profile.

Vice Admiral

Are all Agent processes up on the node (perfstat)?

Absent Member

Yep, all processes are running fine.

# /opt/perf/bin/perfstat
**********************************************************
*** perfstat for isxh2023 on Thu Dec 11 06:43:47 CST 2014
*** HP-UX isxh2023 B.11.31 U ia64 3002978126 unlimited-user license
**********************************************************

list of performance tool processes:
----------------------------------

Perf Agent status:
Running scopeux (Perf Agent data collector) pid 5133
Running midaemon (Measurement Interface daemon) pid 4919
Running ttd (ARM registration daemon) pid 13589

Perf Agent Server status:

Running ovcd (OV control component) pid 4918
Running ovbbccb (BBC5 communication broker) pid 4920
Running coda (perf component) pid(s) 4921
Configured DataSources(1)
SCOPE

Running perfalarm (alarm generator) pid(s) 5149
Running perfd (perfd daemon (real time server)) pid 4906
OV Operation Agent status:
agtrep OV Discovery Agent AGENT,AgtRep (5034) Running
coda OV Performance Core COREXT (4921) Running
opcacta OVO Action Agent AGENT,EA (4999) Running
opcle OVO Logfile Encapsulator AGENT,EA (5015) Running
opcmona OVO Monitor Agent AGENT,EA (5017) Running
opcmsga OVO Message Agent AGENT,EA (4977) Running
opcmsgi OVO Message Interceptor AGENT,EA (5013) Running
opctrapi OVO SNMP Trap Interceptor AGENT,EA (5022) Running
ovbbccb OV Communication Broker CORE (4920) Running
ovcd OV Control CORE (4918) Running
ovconfd OV Config and Deploy COREXT (4922) Running
rtmd HP Real Time Measurement AGENT (5036) Running


************* (end of perfstat -p output) ****************

Vice Admiral

Anything in /var/opt/OV/log/Infraspi.txt?

Fleet Admiral

Hi there,

Are you receiving other measurement threshold alerts, such as disk or memory?

Have you made any changes to the policy? If possible, download and attach the policy.

- Vidyasagar Machani -

Tell me and I forget. Teach me and I remember. Involve me and I learn. -- Benjamin Franklin
Absent Member

Hi Raymond,

Nothing odd. I can also see log entries written while the trace was enabled.

Some of them:

0: INF: Thu Dec 4 14:16:52 2014: SI-CPUBottleneckDiagnosis-test (29611/1): Current Values (Trace):

Total CPU utilization in percentage = 18(Threshold = 1)
CPU Interrupts Rate/second = 868
Context Switches Rate/second = 794
0: INF: Thu Dec 4 14:18:55 2014: SI-CPUBottleneckDiagnosis-test (29932/1): Current Values (Trace):

Total CPU utilization in percentage = 18(Threshold = 1)
CPU Interrupts Rate/second = 868
Context Switches Rate/second = 794
0: INF: Thu Dec 4 14:20:55 2014: SI-CPUBottleneckDiagnosis-test (353/1): Current Values (Trace):

Total CPU utilization in percentage = 15(Threshold = 1)
CPU Interrupts Rate/second = 855
Context Switches Rate/second = 775
0: INF: Thu Dec 4 14:22:55 2014: SI-CPUBottleneckDiagnosis-test (668/1): Current Values (Trace):

Total CPU utilization in percentage = 15(Threshold = 1)
CPU Interrupts Rate/second = 855
Context Switches Rate/second = 775
0: INF: Thu Dec 4 14:24:57 2014: SI-CPUBottleneckDiagnosis-test (965/1): Current Values (Trace):

Total CPU utilization in percentage = 15(Threshold = 1)
CPU Interrupts Rate/second = 855
Context Switches Rate/second = 775
0: INF: Thu Dec 4 14:26:58 2014: SI-CPUBottleneckDiagnosis-test (1240/1): Current Values (Trace):

Total CPU utilization in percentage = 20(Threshold = 1)
CPU Interrupts Rate/second = 944
Context Switches Rate/second = 932
0: INF: Thu Dec 4 14:28:58 2014: SI-CPUBottleneckDiagnosis-test (1541/1): Current Values (Trace):

Total CPU utilization in percentage = 20(Threshold = 1)
CPU Interrupts Rate/second = 944
Context Switches Rate/second = 932
0: INF: Thu Dec 4 14:31:00 2014: SI-CPUBottleneckDiagnosis-test (1857/1): Current Values (Trace):

Total CPU utilization in percentage = 15(Threshold = 1)
CPU Interrupts Rate/second = 836
Context Switches Rate/second = 784

Absent Member

Hi Vidyasagar,

Nope, I cannot see other measurement alerts 😞

I have deployed several policies: the original one and some customized ones.

MONITOR "SI-CPUBottleneckDiagnosis" enabled 1114.0000 (Original Policy)
MONITOR "SI-CPUBottleneckDiagnosis-test" enabled 0001.0008 (Custom policy - for test)
MONITOR "SI-CPUBottleneckDiagnosis_Dazel" enabled 0001.0011 (Custom policy - for test)

OPCMSG "InfraSPI-Messages" enabled 1111.0003
OPCMSG "OA-PerfCollComp-opcmsg" enabled 1100.0000
OPCMSG "opcmsg" enabled 0011.0012

Fleet Admiral

Just to narrow down the issue: if you have deployed other monitor threshold policies (such as CPU or memory) and you are not receiving alerts from them either, then I think it is fair to say that the monitor agent component is not working properly.

Try de-registering opcmona and then re-registering it (hint: use ovcreg).

Instead of debugging the policy, I would suggest debugging the agent components.

- Vidyasagar Machani -

Account_Closed
Not applicable

To the original post:

I hope there is awareness here of how the policy decides when to send an alert.

The logic is as follows:

IF CPU utilization (summarized across all CPUs) is greater than or equal to the CPU utilization threshold,

AND

   EITHER the process queue length (load average) >= the number of active CPUs

   OR the process queue length (load average) >= the configured threshold for run queue length

So not only does the CPU utilization have to be high, the run queue must also be long enough to satisfy one of the two queue tests. How many CPUs are active on the HP-UX box?

Here is the relevant portion of the code:

# Evaluate the CPU bottleneck condition: CPU utilization must reach its threshold
# AND the run queue must reach either the CPU count or its own threshold.
if (   $Session->Value('TotalCpuUtil') >= $Session->Value('GlobalCpuUtilCriticalThreshold')
    && (   $Session->Value('ProcQueueLen') >= $Session->Value('NumCPUs')
        || $Session->Value('ProcQueueLen') >= $Session->Value('RunQueueLengthCriticalThreshold') ) )
{
	$Rule->Status(1);
}
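
To make the condition concrete, here is a minimal Python sketch that models the same test outside the policy runtime. The function and parameter names are hypothetical stand-ins for the `$Session->Value(...)` lookups. The CPU figures (17 % utilization, threshold 1 %) come from the trace earlier in the thread; the queue length, CPU count, and run queue threshold are made-up values chosen only for illustration.

```python
def cpu_bottleneck(total_cpu_util, cpu_util_threshold,
                   proc_queue_len, num_cpus, run_queue_threshold):
    """Model of the policy condition quoted above (names are illustrative)."""
    cpu_high = total_cpu_util >= cpu_util_threshold
    queue_long = (proc_queue_len >= num_cpus
                  or proc_queue_len >= run_queue_threshold)
    return cpu_high and queue_long


# CPU figures from the trace: utilization 17 %, threshold 1 %.
# Queue length, CPU count, and queue threshold are assumed values.
short_queue = cpu_bottleneck(17, 1, proc_queue_len=0.5, num_cpus=8,
                             run_queue_threshold=3)
long_queue = cpu_bottleneck(17, 1, proc_queue_len=9, num_cpus=8,
                            run_queue_threshold=3)

print(short_queue)  # False: CPU is over threshold, but the queue is short
print(long_queue)   # True: both parts of the condition hold
```

Under these assumptions, a low CPU threshold alone never triggers the rule; the run queue has to be long as well, which would be consistent with a trace that shows values but produces no events.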

 

By the way, what are you trying to achieve here?

Also, which version of Infra-SPI is this?

- ramd

Absent Member

Hi ramd,

You rock! That makes sense for what I am looking for.

I am trying to raise alerts based on CPU usage only, regardless of the process queue or process queue length.

I've tried to change the script to check only CPU usage, without success. Could you please help me with that?

It's using the latest version.

Thanks
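
For the CPU-only behaviour requested in this post, the queue test would be dropped from the condition, leaving only the utilization comparison. Below is a minimal Python sketch of that reduced check, applied to a trace line in the format shown earlier in the thread. `parse_trace_line` and `cpu_only_alert` are hypothetical helper names, not part of Infra-SPI, and the actual change would still have to be made in the policy's own Perl script.

```python
import re

# Matches trace lines such as:
#   Total CPU utilization in percentage = 17(Threshold = 1)
TRACE_RE = re.compile(
    r"Total CPU utilization in percentage = (\d+)\(Threshold = (\d+)\)")


def parse_trace_line(line):
    """Return (utilization, threshold) from a trace line, or None."""
    match = TRACE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cpu_only_alert(total_cpu_util, cpu_util_threshold):
    """CPU-only condition: the run queue length is ignored."""
    return total_cpu_util >= cpu_util_threshold


sample = "Total CPU utilization in percentage = 17(Threshold = 1)"
util, threshold = parse_trace_line(sample)
print(cpu_only_alert(util, threshold))  # True
```

With the queue clause removed, the sample from the trace would raise an alert, since 17 is at or above the threshold of 1.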
