Scott_14 Absent Member.

Reading Glance I/O Wait Queue

Hello Admins:

I know I have a couple of disks that are a bottleneck, but I am trying to fully understand something as I gather my information.

Using Glance, under Reports, I/O by Disk, if you drill down to the next screen, there is a Wait Queue Length section in the upper left:

No wait queue
1-2 queue: 100%, sometimes 80%
3-4: 9%
5-8: sometimes 4% or so

Is this the percentage of time that requests are waiting in a queue to be processed or served by the disk?


I am not sure if I am reading this correctly.

Thanks

Accepted Solutions
Stefan Farrelly Absent Member.
I read it as: 4% of the time there are 5-8 jobs waiting in the queue to get I/O from the disk, 9% of the time there are 3-4 jobs waiting, and the vast majority of the time only 1-2 jobs. This is fine. If the majority of the time you have 3-4 or 5-8 jobs waiting, then you are seriously I/O bound.
I'm from Palmerston North, New Zealand, but somehow ended up in London...
6 Replies
G. Vrijhoeven Absent Member.
Hi Scott,

Are you able to run gpm? If so, you can click the ? at the top left of the screen, go to the metric you want explained, and select it. You will get answers.

HTH,

Gideon
Bill Hassell Contributor.
In Glance, there is a neat feature: h
When you are looking at the IO by Disk screen, type h, press Return to select Current Screen Metrics, and then select Qlen to see what Queue Length means. It actually counts physical disk I/O requests and is not a measure of processes. At any millisecond of time, there may be zero or more I/O requests in the queue. The Qlen metric measures the average depth during the measurement interval. Beyond a queue depth of 1, higher numbers mean that requests are being queued faster than the disk can service them. But remember that a large queue depth (like 5 or more) is only meaningful if it exists for long periods of time.

So in your case, the large depth doesn't exist most of the time. Now that doesn't mean that this disk isn't busy. A queue depth of 1 could be continuous from a single process. Most processes do not queue multiple I/Os to the same disk at the same time; they are sent one at a time. A disk queue deeper than 1 is usually created by multiple processes talking to the same disk.

A large queue depth can be a symptom of too many data files on the same physical disk. By that, I mean that several processes need independent access to the same physical disk. By moving some files to other disks, the queue will be reduced and I/Os can proceed in parallel with other disks.
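
To make the idea of "average depth during the measurement interval" concrete, here is a minimal Python sketch. It is only an illustration of a time-weighted average and is not how Glance itself computes Qlen; the function name and the (timestamp, delta) event format are assumptions made up for this example.

```python
def average_queue_depth(events, start, end):
    """Time-weighted average queue depth over the interval [start, end).

    `events` is a hypothetical list of (timestamp, delta) pairs sorted by
    timestamp: delta is +1 when a request is queued, -1 when one completes.
    """
    depth = 0
    last_t = start
    weighted = 0.0
    for t, delta in events:
        if t >= end:
            break
        if t > start:
            # Credit the depth that was in effect since the previous event.
            weighted += depth * (t - last_t)
            last_t = t
        depth += delta
    # Credit the depth in effect from the last event to the interval end.
    weighted += depth * (end - last_t)
    return weighted / (end - start)


# One process issuing one I/O at a time, back to back: depth stays at 1.
one_process = [(0, +1), (1, -1), (1, +1), (2, -1), (2, +1), (3, -1), (3, +1), (4, -1)]
# Two processes each holding one request on the same disk: depth stays at 2.
two_processes = [(0, +1), (0, +1), (4, -1), (4, -1)]

print(average_queue_depth(one_process, 0, 4))    # 1.0
print(average_queue_depth(two_processes, 0, 4))  # 2.0
```

In this toy model a single busy process never pushes the average past 1, while two processes sharing the disk do, which is the pattern described above.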


Bill Hassell, sysadmin
Scott_14 Absent Member.
Thank you, and thanks Bill for the explanation. However, I have been trying to use h and Return with no results.

I am trying to get them (the application folks and such) to move some stuff to another disk, like you mentioned. I am pretty confident I know the disk and the bottleneck. I wanted to be able to add more information and understand it as I present to them why I believe the disk is being hammered so hard.

scott

Bill Hassell Contributor.
Are you running glance or gpm? Glance is the package name and has two very different programs: glance (for terminals) and gpm (for graphs in Xwindows). When you start glance, you can type ? and a set of one-letter commands will be displayed. You can also use the programmable softkeys to navigate through the screens. If typing a single character such as h or d or m produces no results, you probably have an oddball terminal (or emulator). The one-letter commands are defined in the man page. Try using an HP terminal or emulator. If you are using Xwindows, start hpterm rather than dtterm or xterm. For PCs, Reflection for HP (not Reflection/X) is the best, or QCTerm, which is good (and free).


Bill Hassell, sysadmin
Michael Weinsto Absent Member.
Hi,

I wrote a document explaining these metrics a while back. Posting it here for general consumption.

Cheers,

Michael Weinstock



Problem:
What is the relationship between the BYDSK_QUEUE_*_UTIL METRICS and the
BYDSK_REQUEST_QUEUE as shown in GlancePlus ?

How might I use these metrics in troubleshooting ?

How do we generate the BYDSK_REQUEST_QUEUE ?

======================================================================


> 1. Customer wanted to find out how do we get the value for
>
> BYDSK_QUEUE_0_UTIL
> BYDSK_QUEUE_2_UTIL
> BYDSK_QUEUE_4_UTIL
> BYDSK_QUEUE_8_UTIL

The first thing to do in this case, as with any metric, is to refer to the online help subsystem. This is easily accessed in gpm by clicking on the question mark on the right hand side at the top, under the help menu, and then clicking on the heading of the metric you are interested in.

If you are using the character mode of glance, press 'h' to start the help subsystem.

The help text for BYDSK_QUEUE_0_UTIL states:

"The percentage of intervals during which there were no IO requests pending for this disk device over the cumulative collection time.

For example if 4 intervals have passed (that is, 4 screen updates) and the average queue length for these intervals was 0, 1.5, 0, and 3, then the value for this metric would be 50% since 50% of the intervals had a zero queue length."

The help text for BYDSK_QUEUE_2_UTIL says:

"The percentage of intervals during which there were 1 or 2 IO requests pending for this disk device over the cumulative collection time.

For example if 4 intervals have passed (that is, 4 screen updates) and the average queue length for these intervals was 0, 1, 0, & 2, then the value for this metric would be 50% since 50% of the intervals had a 1-2 queue length."

The help text for the rest of these BYDSK_QUEUE_*_UTIL metrics is exactly the same, except that the number of IO requests pending is larger.
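
The help text examples can be reproduced with a short Python sketch. This is an illustration only, not GlancePlus code: the function names are made up, and the bucket boundaries for fractional averages (for example, treating 1.5 as part of the 1-2 bucket, and "X" as anything above 8) are assumptions inferred from the examples above.

```python
BUCKETS = ["0", "1-2", "3-4", "5-8", "X"]


def bucket_for(avg_qlen):
    """Map one interval's average queue length to a bucket name.

    Assumption: fractional averages fall into the next bucket up,
    so 1.5 counts as "1-2" and anything above 8 counts as "X".
    """
    if avg_qlen <= 0:
        return "0"
    if avg_qlen <= 2:
        return "1-2"
    if avg_qlen <= 4:
        return "3-4"
    if avg_qlen <= 8:
        return "5-8"
    return "X"


def queue_util(interval_averages):
    """Percentage of intervals that landed in each queue-depth bucket."""
    counts = dict.fromkeys(BUCKETS, 0)
    for q in interval_averages:
        counts[bucket_for(q)] += 1
    total = len(interval_averages)
    if total == 0:
        return {name: 0.0 for name in BUCKETS}
    return {name: 100.0 * n / total for name, n in counts.items()}


# First help-text example: two of four intervals had a zero queue length.
print(queue_util([0, 1.5, 0, 3])["0"])   # 50.0
# Second help-text example: two of four intervals had a 1-2 queue length.
print(queue_util([0, 1, 0, 2])["1-2"])   # 50.0
```

Each key in the returned dictionary corresponds loosely to one BYDSK_QUEUE_*_UTIL metric ("0" to BYDSK_QUEUE_0_UTIL, "1-2" to BYDSK_QUEUE_2_UTIL, and so on).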


Before we delve into a deeper explanation, let's make sure you are comfortable with the meaning of "interval" in relation to GlancePlus and performance measurement. This refers to the measurement period that we are dealing with. In other words, it is the length of time between screen updates, and it can be changed from the Configure -> Measurement menu in gpm, or by using the 'j' key in character mode glance.

If you are trying to characterize specific disk activity or jobs, then it would probably make sense to 'reset' these metrics as the job or process(es) being characterized starts up.

You can reset these metrics at any time by using the File -> Reset CUM to zero menu or Ctrl^Z. This can be useful for characterizing short bursts of disk activity, because these metrics will decrease quite rapidly after a burst has finished as more measurement intervals pass.
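
The way a cumulative percentage fades after a burst, and why a reset helps, can be seen in a small Python sketch. The function name and the sample values are hypothetical; the only thing taken from the text is that the metric is a percentage of all intervals since the last reset.

```python
def running_share(interval_averages, low, high):
    """Cumulative percentage, after each interval, of intervals so far
    whose average queue length fell between low and high (inclusive)."""
    shares = []
    hits = 0
    for n, q in enumerate(interval_averages, start=1):
        if low <= q <= high:
            hits += 1
        shares.append(100.0 * hits / n)
    return shares


# Hypothetical burst: three intervals at depth 6, then three idle intervals.
samples = [6, 6, 6, 0, 0, 0]
print(running_share(samples, 5, 8))      # [100.0, 100.0, 100.0, 75.0, 60.0, 50.0]

# Counting only from the point of a reset (after the burst) starts from zero.
print(running_share(samples[3:], 5, 8))  # [0.0, 0.0, 0.0]
```

The 5-8 share halves within three idle intervals here, so a burst measured long after the fact can look far milder than it was.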

Another way in which we might describe these metrics is as IO queue length 'buckets' over time. These metrics do NOT measure the number of IO requests in an interval, but rather how deep the disk queue has been, on average, over the cumulative intervals since we first brought up the 'IO by Disk' report screen. Another term I might use to describe these is 'rolling indicators of disk queue depth'.

Unless the system suffers from a prolonged and severe disk bottleneck, I would expect the BYDSK_QUEUE_2_UTIL, BYDSK_QUEUE_4_UTIL, BYDSK_QUEUE_8_UTIL and BYDSK_QUEUE_X_UTIL metrics to remain quite low, especially if they are never reset.

One last example to check your understanding:

If a disk had a queue length of 6, 7, 5 and 1 over 4 consecutive measurement intervals after first entering the "IO by Disk" report screen and resetting the values to zero, BYDSK_QUEUE_8_UTIL (which measures queue depths from 5 to 8) would show a value of 75%. This means that during 75% of the intervals since these metrics were last reset, there were between 5 and 8 physical IO requests pending on the wait queue for the disk in question.


> 2. How does this relate to BYDSK_REQUEST_QUEUE?

That's the easy part. The help text for this metric says:

"The average number of IO requests that were in the wait queue for this disk device during the interval. These requests are the physical requests (as opposed to logical IO requests)."

This is the number (NB: the BYDSK_QUEUE_*_UTIL metrics were percentages, whilst this is a count) of physical IO requests that had to wait for disk access during the last interval (this is NOT a rolling average like the BYDSK_QUEUE_*_UTIL metrics). IO requests that did not have to wait for disk access are NOT included in this count. It is possible to have a disk being heavily utilized (BYDSK_UTIL) but with few or no IO requests waiting on the queue. This situation simply means that the disk is coping well with the current load of IO requests.

The difference here is that the BYDSK_QUEUE_*_UTIL metrics are trying to give the glance user a picture of how deep the queue (when occupied) has been over a period of time, whilst the BYDSK_REQUEST_QUEUE metric simply shows how many IOs had to wait in the queue to be serviced over the last interval.

> 3. The following numbers appear in Disk Detail Screen in gpm.
>
> No Waiting : 94%
> 1-2 Queued : 5%
> 3-4 Queued : 1%
> 5-8 Queued : 0%
>
> Queue Length : (which is 0.0 all the time)

This is probably quite healthy for a system that is under heavy disk load. In this example, no physical IO requests were waiting in 94% of the measurement intervals, and only a small percentage of intervals presented IOs that had to wait in the disk's IO queue.

> I also notice
> BYDSK_QUEUE_0_UTIL
> BYDSK_QUEUE_2_UTIL
> BYDSK_QUEUE_4_UTIL
> BYDSK_QUEUE_8_UTIL
> does not change frequently. Once in a while it change as well
> (sometimes higher, sometimes lower).

If you reset the cumulative values to zero and then apply some disk load, you may see these change. The most likely reason that you do not see them change is that a large number of intervals have passed since these metrics were last reset, and/or that the disk has not had many IO requests queued, especially not at any great depth.


Finally some information about HOW these metrics are generated and some differences between various disk drivers:

> How do we generate the BYDSK_REQUEST_QUEUE ?

This metric (BYDSK_REQUEST_QUEUE) is an attempt to examine the disk queue length randomly throughout the interval, and then divide by the number of examinations to obtain the average. However, the disk queue is only examined with the queue-done trace. That is, it is only examined WHEN THERE ARE IOs ON THE QUEUE.

This means that unless IOs are actually on the disk queue (which generates traces that midaemon will read and convert to metric data), we do not re-examine the queue.

Various drivers return different values for the queue length. The SCSI driver returns the queue length before the IO requests have been removed from the queue, while the FL driver returns the queue length after.
In other words, a queue length of 1 with 1 completing shows a queue length of

SCSI == 1
FL == 0

What this highlights is that the SCSI and fibre link drivers generate traces differently, and thus these metrics will look different on a SCSI disk versus a fibre link disk.
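
As a rough illustration of the averaging and the driver difference described above, here is a small Python sketch. It is not midaemon code: the function name, the "scsi"/"fl" labels and the input format are assumptions for this example, and returning 0.0 when the queue was never examined is also an assumption.

```python
def request_queue_average(depths_at_completion, driver):
    """Average of the queue lengths observed at each IO completion.

    `depths_at_completion` is a hypothetical list of queue depths seen at
    each queue-done event, counting the completing request itself.
    `driver` is "scsi" (length reported before removal) or "fl"
    (length reported after removal).
    """
    if not depths_at_completion:
        # Assumption: with no completions the queue is never examined.
        return 0.0
    if driver == "scsi":
        observed = list(depths_at_completion)
    elif driver == "fl":
        observed = [d - 1 for d in depths_at_completion]
    else:
        raise ValueError("driver must be 'scsi' or 'fl'")
    return sum(observed) / len(observed)


# A disk that only ever has one request outstanding when each IO completes.
print(request_queue_average([1, 1, 1], "scsi"))  # 1.0
print(request_queue_average([1, 1, 1], "fl"))    # 0.0
```

Under these assumptions the same workload reads one deeper on a SCSI disk than on a fibre link disk, so queue lengths from the two kinds of disk are not directly comparable.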