Some email not going out

Been having some weirdness on outbound email. It's been quite
sporadic, but it seems to be getting worse lately.

Some outbound emails that get sent seem to sit in a pending status from
within Groupwise. If that email is resent, or if a different email is
sent to that same address, they go right through.

Any ideas on this? Is it the email actually not going out, or is the
status message getting hung up somewhere?

Running gw2012sp2 (build 108211) on sles11sp3 / oes11sp2.

--
Stevo

Re: Some email not going out

In article <07zxv.3928$BB4.3909@novprvlin0913.provo.novell.com>, Stevo
wrote:
> Some outbound emails that get sent seem to sit in a pending status from
> within Groupwise. If that email is resent, or if a different email is
> sent to that same address, they go right through.
>

Are these messages getting to GWIA? Let's rule out agent link problems
first.
What do GWIA's logs say about the message?

Are all your agents running the same build?

Do you send mail directly to the servers of the recipients or do you go
through any sort of relay such as for antivirus and/or compliance
monitoring?


Andy of
KonecnyConsulting.ca in Toronto
Knowledge Partner
http://forums.novell.com/member.php/75037-konecnya
If you find a post helpful and are logged in the Web interface, please
show your appreciation by clicking on the star below. Thanks!


Re: Some email not going out

Hi Stevo

Further to what Andy has said/asked... Is there anything sitting in the "defer" directory in your GWIA file subsystem? Those would be emails that have not yet been delivered and are deferred according to your retry interval.

Let us know.

Cheers,
Laura Buckley

Views/comments expressed here are entirely my own.

Re: Some email not going out


> Are these messages getting to GWIA? Let's rule out agent link
> problems first.
> What do GWIA's logs say about the message?


Well, the email my boss said he sent first thing that morning does not
appear in the gwia log. The only email to this person that I can find
in the log was sent at 10:26:41, well after the time he says he sent
the first one.


> Are all your agents running the same build?


All are the same build (108211) aside from one POA that, per NTS, is
running build 115067.


> Do you send mail directly to the servers of the recipients or do you
> go through any sort of relay such as for antivirus and/or compliance
> monitoring?


Well, we do have a Gwava appliance that acts as our mail relay, but
I've had confusion about that as well. I have received undeliverable
bounceback emails that appear to come from my gwia rather than from
our Gwava box.

--
Stevo

Re: Some email not going out

laurabuckley sounds like they 'said':

> Further to what Andy has said/asked... Is there anything sitting in
> the "defer" directory in your GWIA file subsystem? Those would be
> emails that have not yet been delivered and are deferred according to
> your retry interval.


So my response to laurabuckley's comment is...

The only items in my defer directory are from earlier today, like 30-45
minutes ago, and those are auto-reply (out of office) emails from a
couple people in the IT dept to auto generated emails that do not have
a valid email address.

--
Stevo

Re: Some email not going out

In article <r8Wxv.4001$BB4.1694@novprvlin0913.provo.novell.com>, Stevo
wrote:
> Well, the email my boss said he sent first thing that morning does not
> appear in the gwia log. The only email to this person that I can find
> in the log was sent at 10:26:41, well after the time he says he sent
> the first one.

This strongly suggests that there is a linkage problem on the path from the
POA through the MTA(s) to GWIA. Is this a basic one PO, one Domain
system all running on one box including GWIA or are there more boxes and
agents involved?
We will need to look for stuck files in any of the wpcsin/# or wpcsout#
of all the agents involved, and possibly the mslocal/mshold. From a
Windows client, Total Commander's Alt-Shift-Enter at those levels very
quickly shows the totals (should be either zero or one tiny file). Or
the Linux command on the server of
du -hx --max-depth=1
does the same. In either case be in the wpcs*/ level where you see the
numbers or just in the mslocal/mshold.
If you have the web console to each of the agents running (which is a
very good idea) start by looking at the MTA(s) and the Links showing. I
suspect some are closing often enough and if so we need to fix that.
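
For the stuck-file check described above, a rough shell sketch along these lines would print a file count for each subdirectory of a queue directory. The function name queue_counts is made up for illustration, and the directory you pass in is whatever wpcsin or wpcsout path applies on your server; nothing in it is GroupWise-specific.

```shell
# Print "<file count> <subdirectory>" for each immediate subdirectory
# of the directory given as the first argument.
# queue_counts is a hypothetical helper name, not a GroupWise tool.
queue_counts() {
    for sub in "$1"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$sub" ] || continue
        # Count regular files at any depth below this subdirectory.
        n=$(find "$sub" -type f | awk 'END { print NR }')
        printf '%s %s\n' "$n" "$sub"
    done
}
```

Run it as, for example, queue_counts /path/to/domain/wpcsin (a placeholder path). Any subdirectory whose count stays above zero or one for long is worth a closer look.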

> Well, we do have a Gwava appliance that acts as our mail relay, but
> I've had confusion about that as well. I have received undeliverable
> bounceback emails that appear to come from my gwia rather than from
> our Gwava box.

We would have to look at a selection of those errors to tell exactly what
is happening. One suspicion is that GWAVA can tell right away that some of
those messages are a problem and triggers the reject midstream, which would
have GWIA telling the user. Once the message is all the way over to
GWAVA, then it would appear as GWAVA notifying.


Andy of
KonecnyConsulting.ca in Toronto

Re: Some email not going out

Andy Konecny sounds like they 'said':

> This strongly suggests that there is a linkage problem on the path from
> the POA through the MTA(s) to GWIA. Is this a basic one PO, one
> Domain system all running on one box including GWIA or are there more
> boxes and agents involved?


Not quite a basic setup. Have 7 PO's on 7 different servers, one of
which has the main MTA on it too. 1 server for webaccess, 1 server
with another MTA for my GWIA.

Primary MTA & GWIA MTA are on the same vlan, on virtual servers in the
same vmware cluster.




> We will need to look for stuck files in any of the wpcsin/# or
> wpcsout# of all the agents involved, and possibly the mslocal/mshold.
> From a Windows client, Total Commander's Alt-Shift-Enter at those
> levels very quickly shows the totals (should be either zero or one
> tiny file). Or the Linux command on the server of
> du -hx --max-depth=1
> does the same. In either case be in the wpcs*/ level where you see
> the numbers or just in the mslocal/mshold.


Checked the mslocal/mshold folder on my gwia server; the only file in
any of the folders is a 0-byte file, xNStore. Will check the other
folders as I have time today.




> If you have the web console to each of the agents running (which is a
> very good idea) start by looking at the MTA(s) and the Links showing.
> I suspect some are closing often enough and if so we need to fix that.


Checking the MTA logs for the MTA that services the GWIA, I do see
several times a day:

10:59:49 F417 GWIA: Gateway now closed
10:59:49 F417 Internet: Domain now closed
10:59:49 F41F GWIA: Gateway now open
10:59:49 F41F Internet: Domain now open

Seems to just bounce and then come right back up.
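
To see how often this happens, a sketch like the following tallies the "now closed" lines per hour. It assumes the log lines keep the HH:MM:SS prefix shown above; the function name closed_per_hour is hypothetical, and the log file to feed it is left to you.

```shell
# Read MTA log text on stdin and print "<hour> <count>" for every hour
# that contains at least one "now closed" line.
# closed_per_hour is a made-up helper name for illustration.
closed_per_hour() {
    awk '/now closed/ {
             split($1, t, ":")      # $1 is the HH:MM:SS timestamp
             count[t[1]]++          # tally by hour
         }
         END { for (h in count) print h, count[h] }' | sort
}
```

Usage would be something like closed_per_hour < /path/to/mta.log, with the path replaced by the real log location. Note that each bounce in the excerpt above logs two "now closed" lines (one for the gateway, one for the domain), so halve the count if you want bounces rather than lines.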

Also see these entries:

11:05:26 F49E MTP: Waiting for busy LISTEN socket to become available.

Does this mean the GWIA is just too busy, or maybe needs some config
tweaking?




> We would have to look at a selection of those errors to tell exactly
> what is happening. One suspicion is that GWAVA can tell right away
> that some of those messages are a problem and triggers the reject
> midstream, which would have GWIA telling the user. Once the message
> is all the way over to GWAVA, then it would appear as GWAVA notifying.


One of them, received one time, was:

The message that you sent was undeliverable to the following:

Address@outbounddomain.com (503 Sequence)


Information about your message:
Subject: Test
GroupWise Message Id: 52FBE0EA.4B2:244:48050
Message log tag: 1372050
Number of send attempts: 1
Time of initial send attempt: 02-12-14 14:00:56

Possibly truncated original message follows:
Received: from campgwia-MTA by <our MX record>
with Novell_GroupWise; Wed, 12 Feb 2014 14:00:55 -0700
Message-Id: <52FB7E7A020000F4000A74B2@our_MX_record>
X-Mailer: Novell GroupWise Internet Agent 12.0.2
Date: Wed, 12 Feb 2014 14:00:26 -0700
From: "Me" <my_email@ourdomain>
To: <Address@outbounddomain.com>
Subject: Test
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline


--
Stevo

Re: Some email not going out

Andy Konecny sounds like they 'said':

> We will need to look for stuck files in any of the wpcsin/# or
> wpcsout# of all the agents involved, and possibly the mslocal/mshold.
> From a Windows client, Total Commander's Alt-Shift-Enter at those
> levels very quickly shows the totals (should be either zero or one
> tiny file). Or the Linux command on the server of
> du -hx --max-depth=1
> does the same. In either case be in the wpcs*/ level where you see
> the numbers or just in the mslocal/mshold.


Checked the wpcsin folders on all POA servers as well as both MTA
servers, 0 files.

--
Stevo

Re: Some email not going out

In article <sudyv.4061$BB4.1288@novprvlin0913.provo.novell.com>, Stevo
wrote:
> Not quite a basic setup. Have 7 PO's on 7 different servers, one of
> which has the main MTA on it too. 1 server for webaccess, 1 server
> with another MTA for my GWIA.

So 6 POAs have to travel the network to get to the MTA. Does the
problem happen to just the one POA with the MTA, only the 6 without one,
all of the above, or is there too little data to be sure yet?

> Primary MTA & GWIA MTA are on the same vlan, on virtual servers in the
> same vmware cluster.

How 'far away' are those other POAs? We also have the POA to MTA link to
consider. Beyond a certain distance (WAN links) it becomes a better
practice for them to have their own MTAs.


> Checking the MTA logs for the MTA that services the GWIA, I do see
> several times a day:
> 10:59:49 F417 GWIA: Gateway now closed
> 10:59:49 F417 Internet: Domain now closed
> 10:59:49 F41F GWIA: Gateway now open
> 10:59:49 F41F Internet: Domain now open

Now that is suspicious given they are on the same box.
Internet domain? Is that the name of the Domain for that MTA? Or something
else? Either way very strange and we certainly need to look at that box
closely.

> 11:05:26 F49E MTP: Waiting for busy LISTEN socket to become available.
> Does this mean the GWIA is just too busy, or maybe needs some config
> tweaking?


Either GWIA or its host. So tell me more about this host: RAM (free -m),
CPU (top), Disk use levels (df -h), i.e. are we pushing any of them too
hard? Also check the network interfaces with ifconfig. If there are a lot
of errors, it could be a failing NIC/cable/port.
Really dig into the different things reporting on the web admin interface,
Status page, for both the MTA and GWIA. Load might be just enough that we
need to up some of those thread options, but first we need to figure out
which one(s).
It is also possible something is DoSing that box. I've seen cases where
some other system that has gone defective or is misconfigured just pummels
the box you are trying to sort out, so compare packet rates to some of your
other servers.

> The message that you sent was undeliverable to the following:
> Address@outbounddomain.com (503 Sequence)

That was GWAVA effectively saying "I don't like that, don't try that again."
Usually that would be seen at the GWAVA level blocking spam coming
inbound, so here it is basically saying that your GWIA was acting up.



Andy of
KonecnyConsulting.ca in Toronto

Re: Some email not going out

Andy Konecny sounds like they 'said':

> So 6 POAs have to travel the network to get to the MTA. Does the
> problem happen to just the one POA with the MTA, only the 6 without
> one, all of the above, or is there too little data to be sure yet?


Too little data to know for sure. The only POA that has the issue that
I've *heard* of is the one on the same box as the MTA.



> How 'far away' are those other POAs? We also have the POA to MTA link
> to consider. Beyond a certain distance (WAN links) it becomes a
> better practice for them to have their own MTAs.


Remote POA servers are 3 hops away, with all being on 1GB connections
sans one. That one is via T-1, but it is a very small POA (6 users),
small enough that if we had a better connection to that site the users
would be migrated to our POA and that POA would go away.



> Now that is suspicious given they are on the same box.
> Internet domain? Is that the name of the Domain for that MTA? Or
> something else? Either way very strange and we certainly need to look
> at that box closely.


Internet is the name of the 'domain' containing external users.


> Either GWIA or its host. So tell me more about this host: RAM (free
> -m), CPU (top), Disk use levels (df -h) i.e. Are we pushing any of
> them too hard. Also check the network interfaces with ifconfig. If
> there are a lot of errors, it could be a failing NIC/cable/port.
> Really dig into the different things reporting on the web admin
> interface, Status page, for both the MTA and GWIA. Load might just
> be enough we need to up some of those thread options, but first to
> figure out which one(s). Also possible something is DOSing that box,
> I've seen where some other system that has gone defective or
> misconfigured just pummels the box you are trying to sort out, so
> compare packet rates to some of your other servers.


free -m: (2 GB allocated to it)
             total       used       free     shared    buffers     cached
Mem:          1878       1706        171          0          7        360
-/+ buffers/cache:       1338        540
Swap:         2047        720       1327

top: gwia hovers around 9% of cpu with occasional spikes to 13-20%

df -h:
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        20G   11G  8.1G  57% /
udev            940M  100K  940M   1% /dev
tmpfs           940M  2.1M  938M   1% /dev/shm
admin           4.0M     0  4.0M   0% /_admin
/dev/pool/DATA  5.1G   93M  5.0G   2% /opt/novell/nss/mnt/.pools/DATA
DATA            5.1G   27M  5.0G   1% /media/nss/DATA

Zero errors in ifconfig, but there are some dropped RX packets, less
than 4000 out of almost 19,000,000.
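
To put those numbers in proportion, here is a small sketch of the arithmetic. drop_pct is a hypothetical helper name, and the inputs are the rounded figures quoted above rather than exact counters.

```shell
# Print dropped packets as a percentage of total received packets.
# Usage: drop_pct DROPPED TOTAL
# drop_pct is a made-up helper name for illustration.
drop_pct() {
    awk -v d="$1" -v t="$2" 'BEGIN {
        if (t > 0) printf "%.4f\n", 100 * d / t
        else print "n/a"            # avoid dividing by zero
    }'
}
```

With the figures above, drop_pct 4000 19000000 prints 0.0211, so roughly 0.02% of received packets. On Linux the raw counters can usually also be read from /sys/class/net/<interface>/statistics/rx_dropped and rx_packets, which makes it easy to sample the rate over time.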


--
Stevo

Re: Some email not going out

Andy Konecny sounds like they 'said':

> Either GWIA or its host. So tell me more about this host: RAM (free
> -m), CPU (top), Disk use levels (df -h) i.e. Are we pushing any of
> them too hard. Also check the network interfaces with ifconfig. If
> there are a lot of errors, it could be a failing NIC/cable/port.


Crud, didn't notice you were talking about the host until after I
posted the last reply.

Hosts are x222 nodes in an IBM PureFlex, each with 128GB RAM and dual
8-core 2.3GHz Xeon CPUs.

Current host the gwia server resides on is running about 14% cpu util,
and 68% memory util.

Storage pool that houses this VM has over 400GB free.

Both nics on the host have 0 errors and 0 drops.

--
Stevo

Re: Some email not going out

In article <b59Av.4279$BB4.1739@novprvlin0913.provo.novell.com>, Stevo
wrote:
> Internet is the name of the 'domain' containing external users.

OK, then that makes plenty of sense. Has the problem just affected them,
none of them, or a blend?

Resources on GWIA's host (we don't really care about the meta-host
underneath it on this issue) look good, so they aren't an issue, assuming
that the server had been up for more than a few days when those free -m
results were taken.

> Zero errors in ifconfig, but there are some dropped RX packets, less
> than 4000 out of almost 19,000,000.

That might be our problem. While not a high number, that can still trip
things up. One possibility is that the equivalent of packet receive
buffers is too small and that increasing it would help. Alas, my Linux
skills are not yet that deep, but hopefully that gives you a clue to
follow if your MTA and GWIA are communicating via MTP.
What is the Link type between GWIA and its hosting MTA? In the Web admin
interface of that MTA, Links, View Link Configuration. Is the link to
GWIA a folder (/media/nss/DATA/domain/wpgate/gwia ) or an IP address?
Under Configuration for the MTA, what does the "Maximum Inbound TCP/IP
Connections:" setting show?
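
On the receive-buffer idea above, one place to look on Linux is the NIC's RX ring size, which ethtool can show and, on drivers that support it, change. This is only a sketch: the function names are made up, the interface name and ring size are whatever applies on your server, and whether a larger ring actually helps with these drops is an assumption, not something confirmed in this thread.

```shell
# Show the current and maximum ring sizes for an interface.
# Usage: show_rx_ring INTERFACE
show_rx_ring() {
    ethtool -g "$1"
}

# Set the RX ring size for an interface. Needs root, is not supported
# by every driver, and does not persist across a reboot.
# Usage: set_rx_ring INTERFACE SIZE
set_rx_ring() {
    ethtool -G "$1" rx "$2"
}
```

A cautious approach would be to run show_rx_ring first, compare the current RX value with the reported maximum, and only then try a larger value with set_rx_ring while watching whether the dropped count keeps climbing.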



Andy of
KonecnyConsulting.ca in Toronto