Commodore

GWIA: "link recovery" messages

GW 18.1.1 -134884 on OES 2018 SP1

Since installing a new server for GWIA for IMAP/POP, I have been seeing a lot of messages like this in the log:

1021gwia.214:16:55:50:946 6FF0 The agent is attempting to reestablish the GroupWise link.
1021gwia.214:16:56:00:949 6FF0 GroupWise link recovery successful.
1021gwia.217:16:56:20:958 6FF0 The agent is attempting to reestablish the GroupWise link.
1021gwia.217:16:56:30:961 6FF0 GroupWise link recovery successful.
1021gwia.21d:17:00:30:043 6FF0 GroupWise link recovery successful.
1021gwia.21f:17:01:10:061 6FF0 GroupWise link recovery successful.
1021gwia.225:17:18:20:481 6FF0 The agent is attempting to reestablish the GroupWise link.
1021gwia.227:17:18:40:491 6FF0 GroupWise link recovery successful.
1021gwia.22d:17:23:10:582 6FF0 The agent is attempting to reestablish the GroupWise link.
1021gwia.22d:17:23:20:584 6FF0 GroupWise link recovery successful.

GWIA also sends "Agent Information" mails containing this:

The communications link to the GroupWise mail system is back up.

I don't see any lost links, and I don't have any reports about problems.

Any idea what these messages mean, or what causes them?

Thanks,

Mirko

Knowledge Partner

Since I have installed a new server for GWIA for IMAP/POP I see a lot of messages like this in the log:

I haven't seen those before.

Did they occur before you even had users pointing to them?

How is this GWIA connected to the rest of the system? How close is it (hops, ping times)?
Is it on its own server without any other GW agents?
Do those messages correspond to busy times, or do they continue at about the same rate through the quietest times?
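
To answer the busy-versus-quiet question from the logs themselves, something like the following rough sketch could count the "link recovery successful" messages per hour. It assumes the lines look exactly like the excerpt in the first post (a file name prefix, then HH:MM:SS:mmm, a thread ID and the message); the regular expression and the function name are my own and would need adjusting if the real log format differs.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt in the first post:
#   <file>:<HH>:<MM>:<SS>:<mmm> <thread> <message>
LINE_RE = re.compile(
    r"^(?P<file>[^:]+):(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2}):\d{3}\s+"
    r"(?P<thread>\S+)\s+(?P<message>.*)$"
)


def recoveries_per_hour(lines):
    """Count 'link recovery successful' messages for each hour of the day."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "link recovery successful" in match.group("message"):
            counts[int(match.group("hour"))] += 1
    return counts


# The log lines quoted in the first post, used here as sample input.
SAMPLE = """\
1021gwia.214:16:55:50:946 6FF0 The agent is attempting to reestablish the GroupWise link.
1021gwia.214:16:56:00:949 6FF0 GroupWise link recovery successful.
1021gwia.217:16:56:20:958 6FF0 The agent is attempting to reestablish the GroupWise link.
1021gwia.217:16:56:30:961 6FF0 GroupWise link recovery successful.
1021gwia.21d:17:00:30:043 6FF0 GroupWise link recovery successful.
1021gwia.21f:17:01:10:061 6FF0 GroupWise link recovery successful.
1021gwia.225:17:18:20:481 6FF0 The agent is attempting to reestablish the GroupWise link.
1021gwia.227:17:18:40:491 6FF0 GroupWise link recovery successful.
1021gwia.22d:17:23:10:582 6FF0 The agent is attempting to reestablish the GroupWise link.
1021gwia.22d:17:23:20:584 6FF0 GroupWise link recovery successful.
"""

if __name__ == "__main__":
    for hour, count in sorted(recoveries_per_hour(SAMPLE.splitlines()).items()):
        print(f"{hour:02d}:00  {count}")
```

Run over a full day of logs, a flat count across all hours would point away from user traffic, while a peak during working hours would point toward it.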

 

 

___
“i’ve sworn an oath of solitude til the blight is purged from these lands”
Andy of Konecny Consulting in Toronto
Commodore

I don't have old enough logfiles, so I can't tell for sure, but I think the messages only came with user traffic. The messages also come most frequently during working hours, so I think this somehow has to do with user traffic.

Our system:

  • primary domain with 4 post offices on three OES 2018 servers.
  • secondary domain with MTA and GWIA for SMTP (and with one post office for testing only), OES 2018 SP1
  • secondary domain with MTA and GWIA for IMAP/POP3, OES 2018 SP1

All servers on VMware ESXi. Secondary MTAs and GWIAs are on two separate servers.

Ping times currently, between IMAP server and the server with primary domain and one post office:

87 packets transmitted, 87 received, 0% packet loss, time 85998ms
rtt min/avg/max/mdev = 0.325/0.448/1.783/0.168 ms

Is network connectivity between the ESXi hosts too bad? I think I'll try to run both servers on the same host and see what happens.

Thanks,

Mirko

 

Micro Focus Expert

@mguldner 

Hi Mirko,

Go to /etc/sysconfig/ and edit the grpwise file, changing this line:

GROUPWISE_MAX_OPEN_FILE_HANDLES=""

to this

GROUPWISE_MAX_OPEN_FILE_HANDLES="200000"

Restart the GroupWise agents on the server.
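
To check whether the new limit has actually taken effect after the restart, and how close the agent gets to it, a small sketch like this could read the values from /proc. It assumes a Linux system with the usual /proc layout; the function names are made up for illustration, and reading another process's fd directory normally requires root or the same user.

```python
import os


def parse_max_open_files(limits_text):
    """Return the (soft, hard) 'Max open files' values from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max", "open", "files", soft limit, hard limit, units
            soft, hard = line.split()[3:5]
            return soft, hard
    raise LookupError("no 'Max open files' line found")


def handle_usage(pid):
    """Return (open handle count, soft limit, hard limit) for a running process."""
    with open(f"/proc/{pid}/limits") as limits_file:
        soft, hard = parse_max_open_files(limits_file.read())
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_count = len(os.listdir(f"/proc/{pid}/fd"))
    return open_count, soft, hard
```

Calling `handle_usage()` with the PID of the GWIA process (taken from ps, for example) should show a limit of 200000 if the setting was picked up. If the open count stays far below the old limit, the handle limit is probably not what is causing the messages.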

Cheers,

 

Laura Buckley

Views/comments expressed here are entirely my own.
Commodore

Hi Laura,

Thanks for the hint, I will try this.

In the meantime I found another symptom: in Ganglia there was a strange stair pattern of CPU usage:

clipboard_image_0.png

Since I restarted the GWIA yesterday, I have not seen the messages again.

Does this symptom fit the picture? Do you think this might be caused by MAX_OPEN_FILE_HANDLES?

Thanks,

Mirko

Knowledge Partner

In the meantime I found another symptom, in Ganglia there was a strange stair-pattern of CPU usage:

I'd take a good look at what is busy with top when Ganglia shows it that high. There might be something else that is stuck and getting in the way. Forcing GROUPWISE_MAX_OPEN_FILE_HANDLES="200000" may only be a workaround for something else that is troubling the system.

By default top sorts by CPU use, so the busiest process is shown first. Two additional keys are useful in this case: press 1 to show each assigned core separately, which tells you whether it is a single-threaded issue or more, and press c (lowercase c) to show more details about how each process was launched. Both are toggles, so pressing them again turns them off.

 

 

 

 

Commodore

Today it happened again, this time with our other GWIA (for IMAP):

clipboard_image_0.png

clipboard_image_2.png

I found 2 threads hanging:

clipboard_image_3.png

Again restarting the GWIA solved this. Is there a way to kill these threads without restarting the GWIA? Is increasing GROUPWISE_MAX_OPEN_FILE_HANDLES a remedy against hanging threads?

Thanks,

Mirko

Knowledge Partner

Again restarting the GWIA solved this. Is there a way to kill these threads without restarting the GWIA? Is increasing GROUPWISE_MAX_OPEN_FILE_HANDLES a remedy against hanging threads?

If the offending thread is actually running under its own PID and we could identify it, then the standard kill command might do.
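
To see which thread IDs exist under the GWIA process and which of them are burning CPU, a rough sketch along these lines could read the per-thread stat files under /proc. It assumes Linux and the field layout described in the proc man page; the function names are my own, and the kernel-level thread names may not match the handler names shown in the GWIA console. As far as I know, a signal sent with kill to a thread ID is delivered to the whole process on Linux, so this helps with finding the busy thread rather than with killing it on its own.

```python
import os


def parse_stat(stat_text):
    """Return (name, state, cpu_ticks) from the contents of a /proc stat file."""
    # The name is wrapped in parentheses and may contain spaces, so split around it.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    name = stat_text[left + 1:right]
    fields = stat_text[right + 2:].split()
    state = fields[0]
    # utime and stime are fields 14 and 15 in proc(5), which is
    # index 11 and 12 once pid and name have been cut off.
    cpu_ticks = int(fields[11]) + int(fields[12])
    return name, state, cpu_ticks


def list_threads(pid):
    """Return (tid, name, state, cpu_ticks) per thread, busiest first."""
    threads = []
    for tid in os.listdir(f"/proc/{pid}/task"):
        with open(f"/proc/{pid}/task/{tid}/stat") as stat_file:
            name, state, cpu_ticks = parse_stat(stat_file.read())
        threads.append((int(tid), name, state, cpu_ticks))
    return sorted(threads, key=lambda thread: thread[3], reverse=True)
```

Calling `list_threads()` twice a few seconds apart with the GWIA PID and comparing the tick counts should show which threads are actually consuming CPU, as opposed to threads that accumulated their time earlier.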

What do you see when you click on that active thread?  I don't have a system with active IMAP that I can look at currently so I can't play with it.

Increasing the handles is just a delaying tactic and doesn't get to the root cause. Do those GWIMAP-SSL-Handler_## threads get progressively used up? Seeing those two at well over an hour is suspicious. It makes me wonder if something is just hanging on to a connection and getting in the way, with a bad IMAP client being a possible cause. If we can find out who keeps those connections open for a long time, we might see a pattern. Is it always the same client(s)?

 

Commodore

Looking for a pattern was my thought too - but until now, I don't see one. Unfortunately it's not always the same user causing this, that would be too easy 🙂

clipboard_image_4.png

clipboard_image_5.png

clipboard_image_6.png

6 different users, 6 different IP addresses.

I did not see the handler threads getting used up. In one case two were "hanging", in another case four; in both cases most of the threads were "idle".
