Absent Member.

GWIA stops processing messages and link goes down/up

Running GW2012 SP1 on OES 11 SP1, physical servers, 2 nodes using NCS

We have started to experience issues with our GWIA: every couple of weeks the agent stops processing messages and gwmon fires off a bunch of link down and link up messages. When you go to the HTTP admin portal, all the message queues show negative one (-1). The only way to fix the issue is to manually stop and then start the service at the command line (rcgrpwise stop gwia.domain && rcgrpwise start gwia.domain).

When I look at the logs I do notice errors like the following:

05:33:39 F2B7 Error - Unable to create send file
05:33:39 F2B7 MSG 25584260 Error: Fatal error processing message
05:33:39 F2B7 MSG 25584260 Deferred delivery file memory error -- message undeliverable

or

5:34:50 F477 The agent could not read this message from the message transport.
05:34:50 F477 Too many files open Error, error code = 820A.
05:34:50 F477 MSG 25562878 Deferred delivery file memory error -- message undeliverable.

This reminds me of the issues I had in an earlier version of GW 8 (this system was upgraded to 2012 SP1 from GW 8.x), where I had to modify the limits.conf file. I thought those issues were cleared up with GW8SP2, and I haven't seen anything that references GW2012. The old modifications are still in place from before (never had a reason to remove them), and there is also plenty of free space on the volume that this GWIA/Domain runs off (NSS). Not sure where else to look on this one.
Vice Admiral

I would focus on
Too many files open Error, error code = 820A
Absent Member.

Looked into that, but I cannot work out why it is having this issue, since:

- GroupWise 8.0.2 and above should have had this issue corrected
- the workaround for versions before 8.0.2 is still in place
- there is more than enough space
- no other agents on the same node (2 POAs, 1 MTA and another GWIA) are having such issues.
Micro Focus Expert

By default, the maximum number of open files per process on a Linux server is 1024. Are you sure you are not hitting this maximum?

"The ulimit -a command will show the maximum number of open files. For OES Linux/Suse servers, the default is 1024 files. On a busy Webaccess server, this will be too low, particularly if the agents are running as root. Additionally, all resources in linux are treated as a file, including sockets, so it's important each process has enough file handles to do its job." http://ngwlist.com/pipermail/ngw/2007-October/116289.html

Cheers,
Laura Buckley

Views/comments expressed here are entirely my own.
If you find this post helpful, please show your appreciation and click on "Like" below...
Absent Member.

As mentioned previously the workaround for GW8 is still in place on this server, so in limits.conf I have


> cat /etc/security/limits.conf | grep nofile
* soft nofile 8192
* hard nofile 65535


I also checked the max files for sysctl as explained in the link to the NGW list thread and I got:

> sysctl -a | grep fs.file-max
fs.file-max = 807716

From both of the above it would seem that the limit on the number of open files is not the problem. Or am I incorrect in that assumption? When I run "lsof | wc -l" I usually get numbers from 10000-13000, but that is for the entire server, correct? Whereas the soft limit applies to an individual process.
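
To separate the per-process picture from the server-wide lsof total, something like the following sketch compares one process's open descriptors with the soft limit that process is actually running under. It reads /proc directly, so it shows the limit the running agent inherited, which is not necessarily what limits.conf says (as far as I know limits.conf is applied through PAM at login, so a daemon started at boot may not pick it up). The process name gwia and the helper name fd_usage are assumptions for illustration, and reading another user's /proc/<pid>/fd needs root.

```shell
#!/bin/bash
# Print "<open fds> <soft limit>" for one process, read from /proc.
fd_usage() {
    local pid=$1
    local open soft
    # Each entry in /proc/<pid>/fd is one open descriptor (files and sockets).
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    # The "Max open files" row holds the soft limit in column 4.
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "$open $soft"
}

# Assumption: the agent's process name is "gwia"; adjust to match ps output.
for pid in $(pgrep -x gwia); do
    echo "pid $pid: $(fd_usage "$pid")"
done
```

If the first number creeps toward the second between restarts, the per-process limit is what the agent is hitting, regardless of fs.file-max.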
Micro Focus Expert

Sorry! It was worth a try!

Cheers,
Laura Buckley

Knowledge Partner

In article <bogdansk.5p9z1d@no-mx.forums.novell.com>, Bogdansk wrote:
> As mentioned previously the workaround for GW8 is still in place on this
> server, so in limits.conf I have
> > cat /etc/security/limits.conf | grep nofile
> * soft nofile 8192
> * hard nofile 65535


Perhaps it might be time to remove those for now, given that the particular
issue at that time was a bug that has since been patched. I don't have any
such entries on my systems, and it is possible that they now represent
roadblocks.

> I also checked the max files for sysctl as explained in the link to the
> NGW list thread and I got:
> > sysctl -a | grep fs.file-max
> fs.file-max = 807716


I'm thinking that this is just a representation of what is allocated and
not a limit, as that number is all over the place (404080, 153936, 290420,
and 290420) on the servers I'm looking at now, though I have not had to
dive into that level before. Or perhaps application installations such as
GroupWise increase it to be proactive, but my samples are all GW8 systems
at one client (getting ready for GW2012). It might be worth A) logging what
those numbers are at different times on different servers, and B) bringing
this up in the SuSE.Support.Server.Configure-Administer forum to get to
the bottom of this. Or there is always the option of opening an SR.
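
Option A could be sketched roughly as follows. On Linux, /proc/sys/fs/file-nr reports three numbers: file handles currently allocated, handles allocated but unused, and the system-wide maximum (the same value sysctl shows as fs.file-max). As far as I know the kernel sizes the default maximum from installed memory, which would be one explanation for the values differing between servers. The helper name log_file_nr and the log path are made up for illustration.

```shell
#!/bin/bash
# Append one timestamped sample of system-wide file handle usage to a log.
log_file_nr() {
    local logfile=$1
    local allocated unused max
    # file-nr holds: allocated handles, unused handles, system-wide maximum.
    read -r allocated unused max < /proc/sys/fs/file-nr
    printf '%s allocated=%s unused=%s max=%s\n' \
        "$(date '+%F %T')" "$allocated" "$unused" "$max" >> "$logfile"
}

# Hypothetical log location; pick whatever suits the server.
log_file_nr "${TMPDIR:-/tmp}/file-nr.log"
```

Run from cron on each server, this gives a series that shows whether the allocated count ever approaches the maximum.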


Andy Konecny
Knowledge Partner (voluntary SysOp)
KonecnyConsulting.ca in Toronto
Cadet 2nd Class

Did you ever find the underlying issue?

I too am experiencing this every couple of weeks, running GW2012 SP1 on OES 10sp3. Before that I was running 8.0.3 and never had the issue.

Error - Unable to create send file
07:02:01 F145 MSG 6803132 Error: Fatal error processing message
07:02:01 F145 MSG 6803132 Deferred delivery file memory error -- message undeliverable.

Just checking before I open an SR

Thanks
Christa
Absent Member.

From the NGW list I have heard that it is a known socket (file) leak issue in the GWIA code (v2012 SP1). They have corrected the issue in the private beta of SP2, but it won't be released to the public until the general release of SP2 in a month or two.
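
If it really is a socket leak, it should show up as a steadily growing number of socket descriptors on the GWIA process between restarts. A rough way to watch for that is sketched below; the process name gwia and the helper name socket_count are assumptions, and it needs root to read another user's descriptors.

```shell
#!/bin/bash
# Count how many of a process's open descriptors are sockets.
socket_count() {
    local pid=$1 n=0 fd target
    for fd in /proc/"$pid"/fd/*; do
        # Each fd entry is a symlink; sockets resolve to "socket:[inode]".
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            socket:*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

# Assumption: the agent's process name is "gwia"; adjust to match ps output.
for pid in $(pgrep -x gwia); do
    echo "$(date '+%F %T') pid=$pid sockets=$(socket_count "$pid")"
done
```

Sampling this every few minutes and comparing the count against the soft nofile limit would give some warning before the 820A errors start.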
Absent Member.

Hello,
We are having the same issue with the same symptoms as described above. We raised an SR and were pointed to a beta release of SP2, which of course we cannot deploy in production, so we have to wait for a fully tested SP2.
Our environment is SLES11 SP1 and OES11.
Hopefully engineering will get a wriggle on and supply it soon, as there are a number of pressing issues to be fixed.
Regards DEE
Absent Member.

We also had this error pop up last night. Thanks for posting; you saved me the time of opening an SR.


>>> bogdansk<bogdansk@no-mx.forums.novell.com> 16/01/2013 10:46 AM >>>


--
bogdansk
------------------------------------------------------------------------
bogdansk's Profile: http://forums.novell.com/member.php?userid=110
View this thread: http://forums.novell.com/showthread.php?t=463181