Proactively testing your GroupWise system's IO performance

We've all taken the call. You know the one: your CxO says GroupWise is slow, the old needle-in-a-haystack call. Looking through the logs on the POA yields nothing, and testing with a test account on the CxO's post office doesn't show the symptom that's been reported, yet the helpdesk keeps taking calls that things are slow. CPU looks good, no backup is running, the network team says everything looks good... there's no obvious cause. Now what?

IO performance testing

We all want to avoid "the call". Performing some simple tests and keeping the results from when "things are good" can go a long way towards identifying whether or not disk IO performance is contributing to slowness. Here are some simple tests that should identify what sustained disk IO you can expect from your system.

Note: dd can be destructive. It overwrites whatever of= points to without asking, so carefully review your command before executing it.

Sustained write test example:
sync;time -p dd if=/dev/zero of=/media/nss/GW/testfile bs=1024k count=10000

Sustained read test example:
sync;time -p dd if=/media/nss/GW/testfile of=/dev/null

Explanation of the test commands:

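sync forces everything that's currently cached to be written to disk. dd then reads a stream of zeros from /dev/zero and writes it in blocks of 1024k, 10000 blocks in total (about 10 GB), to a file named /media/nss/GW/testfile, while time -p reports how long the copy took. The second command copies the testfile to a device named /dev/null (the bit bucket in the sky), which discards the data, so only the read is measured.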
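One caveat worth keeping in mind with the commands above: without extra flags, dd can report a write as complete while some of the data is still sitting in the page cache, and the read test uses dd's default block size because no bs is given. The following is a minimal sketch of a variant, assuming GNU dd (coreutils). The function names write_flush and read_back are hypothetical, and the path and block count are passed in as arguments rather than hard-coded.

```shell
# Hypothetical helper names; the file path and block count are arguments.

# Write test: conv=fdatasync (GNU dd) makes dd flush the file data to disk
# before it prints its summary, so cached writes do not inflate the rate.
write_flush() {
    sync
    dd if=/dev/zero of="$1" bs=1024k count="$2" conv=fdatasync
}

# Read test: bs=1024k matches the block size used for the write test;
# without bs, dd copies in 512-byte blocks.
read_back() {
    sync
    dd if="$1" of=/dev/null bs=1024k
}
```

For the test described in this article, that would be something like write_flush /media/nss/GW/testfile 10000 followed by read_back /media/nss/GW/testfile. The same warning applies: the first argument is overwritten.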

What you should see:

If you have decent hardware and it's not a busy time for disk access, you should see throughput numbers that exceed 200 MB/s on the sustained write test and 300 MB/s on the read test. Re-run the test a few times, then average the returned values to get a better picture of what the throughput numbers are when things are good. If you don't see at least these numbers, you'll want to closely watch any POA hosted on this server for signs of degradation in service delivery, or choose a different server to host your POA.
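Re-running and averaging by hand gets tedious, so here is a rough sketch of how it could be scripted. It assumes GNU dd's summary line format ("... bytes ... copied, N s, ..."), and the function names avg_dd_mbps and write_test, along with the default of 3 runs, are illustrative choices rather than anything GroupWise provides. The rate is recomputed from the byte count and elapsed seconds, with 1 MB taken as 1,000,000 bytes.

```shell
# Average throughput from dd summary lines read on stdin.
# Expects lines like: 10485760000 bytes (10 GB) copied, 422.616 s, 24.8 MB/s
avg_dd_mbps() {
    awk '/ bytes / && / copied, / {
             # The elapsed time is the field just before the literal "s,"
             for (i = 2; i <= NF; i++)
                 if ($i == "s,") { total += ($1 / $(i-1)) / 1000000; runs++ }
         }
         END { if (runs) printf "%.1f MB/s average over %d run(s)\n", total / runs, runs }'
}

# Hypothetical wrapper: run the sustained write test several times and
# average the results. Arguments: test file, block count, number of runs.
write_test() {
    testfile=${1:-/media/nss/GW/testfile}
    count=${2:-10000}
    runs=${3:-3}
    i=0
    while [ "$i" -lt "$runs" ]; do
        sync
        # LC_ALL=C keeps the summary line in the format the parser expects
        LC_ALL=C dd if=/dev/zero of="$testfile" bs=1024k count="$count" 2>&1 | tail -n 1
        i=$((i + 1))
    done | avg_dd_mbps
}
```

Calling write_test /media/nss/GW/testfile 10000 3 would run the write test three times and print a single averaged figure. Saved dd output from earlier runs can also be piped straight into avg_dd_mbps.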

Sample results:
smoring@slowpoke:~/Desktop> sync;time -p dd if=/dev/zero of=/media/nss/GW/testfile bs=1024k count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 422.616 s, 24.8 MB/s
real 422.62
user 0.02
sys 13.64

As you can see from this example, the disk throughput during the sustained write test, 24.8 MB/s, is far below 200 MB/s. A well-performing GroupWise system requires enterprise-class disk throughput. Without it, you'll be taking evasive action when the CxO comes looking for you...
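If you record these numbers regularly, the comparison against your expected floor can be automated. A minimal sketch, where check_rate is a hypothetical helper name and the two arguments are the measured rate and the floor, both in MB/s:

```shell
# Compare a measured rate against an expected floor (both in MB/s).
# Prints OK or BELOW. awk is used because POSIX sh has no floating-point math.
check_rate() {
    awk -v got="$1" -v want="$2" 'BEGIN {
        if (got + 0 >= want + 0) print "OK"; else print "BELOW"
    }'
}
```

With the sample result above, check_rate 24.8 200 reports BELOW.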

