Super Contributor.

High CPU utilization and connectivity lost to Post Office

Since the GroupWise 2014 upgrade, we have been experiencing a serious problem on our large Post Offices (1200 users at peak).

During normal work days, CPU utilization peaks at 30%. The problem comes in with the big scheduled check that starts on Saturday at 02:00.
The check pulls a steady 25%, most of which is disk I/O, until Tuesday morning, when it seems to start checking the msgxx.db files.

Total utilization then hovers around 95% (disk 60%), but the PO still feels normal. Two to three times during the day, however,
all client connectivity is lost for about 10-15 minutes, during which total CPU falls to 80% but disk jumps to 75%.
On both Tuesdays this problematic phase started around the same time. Also, all the logs just show a gap for that period, with no errors.

Do you have any recommendations?

The server has 4 CPUs and 48 GB of memory assigned.

Micro Focus Expert

Re: High CPU utilization and connectivity lost to Post Offic

Hi,

What SP version is your GroupWise 2014 backend install?

What operating system is this running on - Windows or Linux?

Please let us know - thanks.

Cheers,
Laura Buckley

Views/comments expressed here are entirely my own.
If you find this post helpful, please show your appreciation and click on "Like" below...

Super Contributor.

Re: High CPU utilization and connectivity lost to Post Offic

GW2014 SP2 and we are on SLES11 SP2, OES11.

Micro Focus Expert

Re: High CPU utilization and connectivity lost to Post Offic

Hi,

Awesome. When you are experiencing such high CPU utilization, have you run "top" to see what exactly is "killing" the CPU? If you don't mind doing that and reporting back, we can see which GroupWise component (MTA, POA, DVA) is actually causing the issue. The reason I ask is that there were issues in earlier versions of 2014 with the DVA running wild.
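
If it helps, the same per-component view can be captured non-interactively. This is only a rough sketch: it assumes the agent processes are named gwpoa, gwmta, gwdva and gwia (check the names on your own server), and the function name gw_cpu_filter is made up for this example.

```shell
#!/bin/sh
# Sketch: rank GroupWise agent processes by CPU share, highest first.
# ASSUMPTION: agent process names are gwpoa, gwmta, gwdva, gwia.
# gw_cpu_filter is a hypothetical helper name, not a GroupWise tool.

# Reads "pcpu comm" pairs on stdin, keeps only the assumed agent names,
# and sorts them numerically in descending order of CPU share.
gw_cpu_filter() {
    awk '$2 ~ /^gw(poa|mta|dva|ia)$/ { print $1, $2 }' | sort -rn
}

# Live usage on the server:
#   ps -e -o pcpu= -o comm= | gw_cpu_filter
```

Keep in mind that the %CPU figure from ps is averaged over the lifetime of each process, so it is a coarser signal than what top shows during the outage window. Running top in batch mode around the time connectivity drops would give a more direct picture.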

Cheers,
Laura Buckley

Super Contributor.

Re: High CPU utilization and connectivity lost to Post Offic

We moved the inactive mailboxes to another POA. This will reduce the size of the POA and will have minimal impact on the maintenance tasks. We also rearranged the scheduled jobs, moving them to start earlier on Saturday morning. Hopefully this will solve the problem.

Knowledge Partner

Re: High CPU utilization and connectivity lost to Post Office

In article <plaubscher.726kzb@no-mx.forums.microfocus.com>, Plaubscher
wrote:
> The server has 4 CPUs and 48 GB of memory assigned.

I am assuming you left the SP1 off OES11 (to match the SP2 level of
SLES11); otherwise the mismatch could be causing you some grief.

That should be enough resources, so let's confirm that they are actually
seen by the OS and apps. Check during a regular work day, off hours, and
while the GWCheck is running.

- The "free -m" command will show what the OS sees and whether it is
using it all up. Caching will certainly have 'used' it all up, so the
real test is whether the system dips into swap much. If your swap space is
getting used up, then more RAM would help.

- "top" then press "1", which should then show all 4 CPUs. Are the
GWCheck threads spreading over all 4 CPUs? As Laura asked, confirm
which tasks are sucking the CPU to make sure we aren't hitting one of
those known issues.
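
To make the swap check easy to repeat at the three times mentioned above, something along these lines could be used. It is a minimal sketch, assuming the usual "Swap:" row in the output of free -m with total and used as the second and third fields; the function name swap_used_pct is invented for this example.

```shell
#!/bin/sh
# Sketch: report swap usage as a whole-number percentage of total swap.
# ASSUMPTION: input is `free -m` output containing a "Swap:" row whose
# 2nd and 3rd fields are total and used (in MB).
# swap_used_pct is a hypothetical helper name.

swap_used_pct() {
    awk '$1 == "Swap:" {
        if ($2 > 0) printf "%d\n", ($3 * 100) / $2   # truncated percentage
        else print 0                                  # no swap configured
    }'
}

# Live usage on the server:
#   free -m | swap_used_pct
```

A single reading only shows how much swap is occupied. To see whether the box is actively swapping while the check runs, the si and so columns of "vmstat 5" are the more telling numbers.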

For more bits on free and top,
http://www.konecnyad.ca/andyk/nixadmin.htm

Another thing to look at is what file system you are running on. EXT3?
NSS? Other? They all have their own optimizations that would make an
impact on a system of this size. Salvage should be off for NSS, and noatime
and nodiratime should be set for any file system where they apply (both
NSS and EXT3).
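
For the EXT3 case, the mount options can be checked from /proc/mounts. The sketch below only covers the noatime part and only file systems whose options show up in /proc/mounts; how NSS reports these settings is not covered here. The function name mounts_missing_noatime is made up for this example.

```shell
#!/bin/sh
# Sketch: list mount points of a given file system type that lack noatime.
# ASSUMPTION: input is in /proc/mounts format
#   (device mountpoint fstype options dump pass).
# mounts_missing_noatime is a hypothetical helper name.

mounts_missing_noatime() {
    awk -v fstype="$1" '$3 == fstype {
        n = split($4, opts, ",")
        found = 0
        # Exact match per option, so "relatime" is not mistaken for "noatime".
        for (i = 1; i <= n; i++) if (opts[i] == "noatime") found = 1
        if (!found) print $2
    }'
}

# Live usage on the server:
#   mounts_missing_noatime ext3 < /proc/mounts
```

Any mount point printed by this is a candidate for adding noatime in /etc/fstab, assuming it holds GroupWise data.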



Andy of
http://KonecnyConsulting.ca in Toronto
Knowledge Partner
http://forums.novell.com/member.php/75037-konecnya
If you find a post helpful and are logged in the Web interface, please
show your appreciation by clicking on the star below. Thanks!

Knowledge Partner

Re: High CPU utilization and connectivity lost to Post Office

On 25.08.2015 15:16, plaubscher wrote:
>
> GW2014 SP2 and we are on SLES11 SP2, OES11.
>
>

That would be an unsupported combination, GW2014 officially starts at
SLES11SP3, aka OES11SP1.

On top of that, we know nothing about your storage, but your issue is
very clearly storage related.

CU,
--
Massimo Rosen
Novell Knowledge Partner
No emails please!
http://www.cfc-it.de

Super Contributor.

Re: High CPU utilization and connectivity lost to Post Offic

Sorry, my mistake. I am confirming my SLES and OES versions below; they are not as I stated above. I will try the suggestions.

# cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 3
# cat /etc/issue
Welcome to SUSE Linux Enterprise Server 11 SP3 (x86_64) - Kernel \r (\l).

Knowledge Partner

Re: High CPU utilization and connectivity lost to Post Offic

mrosen;2405585 wrote:
SLES11SP3, aka OES11SP1.


For the record, that should read SLES11SP3, aka OES11SP2. I know it's a typo on your side Massimo, but just to clear it up.

As for seeing a higher load on Linux servers running GroupWise 2014 vs the load they had when running GroupWise 2012 (e.g. simply comparing before and after upgrade load), it does seem "something" is causing more load when running GroupWise 2014.

I haven't been able to pinpoint it myself, but disk I/O seems comparable... yet the reported system load (top) does peak higher (without seeing DVA or other "stuff" continually causing CPU peaking).

"Thinking out loud":
Verbose logging on the POA, in my case, has so far not shown any pointers to what might be causing a generally higher load (2012 vs 2014). So I'm curious what's causing the load increase that we are experiencing after an upgrade.
On a side note, since upgrading to 2014 SP2 I have been seeing Mobility nag about searches without a filter set (don't have the exact message ATM). The Mobility servers in question are running 2.1 code.
That could possibly be one thing that's putting more load on the POA and slowing other stuff down.

Cheers,
Willem

Knowledge Partner

Re: High CPU utilization and connectivity lost to Post Offic

magic31;2408791 wrote:
...On a side note, since upgrading to 2014 SP2 I have been seeing Mobility nag about searches without a filter set (don't have the exact message ATM). The Mobility servers in question are running 2.1 code.


This is the one I mean : EA18 Searching over the entire mailbox requires a filter

Could well be a red herring.... but I don't remember seeing it before having upgraded a POA to GroupWise 2014 SP2, and certainly not as much as I've seen it after the upgrade.

Absent Member.

Re: High CPU utilization and connectivity lost to Post Offic

The EA18 'errors' are caused by buggy communication between GMS 2.1 and GW 2014.
This is fixed in the coming GMS 2.2 / GW Cornell releases.

There is an engineering build for GMS 2.1 that fixes the problem - maybe NTS can help you get it. Bug ID is: 948060

Knowledge Partner

Re: High CPU utilization and connectivity lost to Post Offic

MFaust;2408797 wrote:
The EA18 'errors' are caused by buggy communication between GMS 2.1 and GW 2014.
This is fixed in the coming GMS 2.2 / GW Cornell releases.

There is an engineering build for GMS 2.1 that fixes the problem - maybe NTS can help you get it. Bug ID is: 948060


Great info, thanks! I had searched for the error in relation to GMS but had not found anything. Thanks for the bug ID. I am still wondering whether it would have a noticeable negative effect on performance, which seems plausible if it means many extra requests.

Cheers,
Willem