Anonymous_User Absent Member.

eDir driver connections getting stuck in CLOSE_WAIT state


I have a site that has a firewall between their Identity Vault and their
eDir auth tree. I have a traditional eDirectory driver setup between
the two trees. For some reason, after a while, I'll start seeing tons
of CLOSE_WAIT connections on one side and the driver will stop
functioning. A driver restart fixes it and things start flowing again.


Obviously, it's easy to blame the network/firewall (as support has), but
I can still open TCP connections in either direction when this happens
on the port being used for the eDir driver. It's just that the IdM
engine/driver won't recover.

Support suggested writing a script to change an object on each side
every few minutes to keep the connection alive, which we've done. It
works, but I find this solution somewhat unacceptable.

Has anyone seen behavior like this before? Any ideas how to fix? We
did mess with the publisher timeouts, but that did not seem to make any
difference.

This is eDir 8.8 SP8 and IdM 4.0.2 with engine patch 4.

The other big difference here is this is on RedHat. So I'm not sure if
that would play any factor here at all.

Thanks for any suggestions.

Matt


--
matt
------------------------------------------------------------------------
matt's Profile: https://forums.netiq.com/member.php?userid=183
View this thread: https://forums.netiq.com/showthread.php?t=49883

cpedersen Outstanding Contributor.

Re: eDir driver connections getting stuck in CLOSE_WAIT state

Kind of a known issue.

- Do you see the issue from both sides?
- Is the firewall a stateful firewall?

One of the major issues with firewalls is that they work against
connections that stay open for a long time. For example, when the
edir2edir driver starts up it opens a connection to the other side; the
firewall forwards the port and then tracks the connection. If no sync
traffic flows for a while, the firewall might think something dodgy is
going on and decide to kill the connection, and then the driver starts
having problems.

I have seen your issue before, and the way I've fixed it is to use a job
that just searches for an object on the destination every 2 to 5 minutes
(max 300 seconds) and does nothing more.

It's not a problem with IDM and it's not a problem with the firewall,
it's just the way it works.
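
As a rough illustration of the keep-alive idea above, here is a minimal Python sketch of a loop that fires a probe at a fixed interval shorter than the idle limit. The names `run_keepalive` and `probe` are hypothetical and not part of IDM; in practice the probe would have to cause traffic on the driver's own connection (for example by triggering a search or a change that the driver processes), since traffic on a separate connection does not refresh the firewall's state for this one.

```python
import threading
from typing import Callable


def run_keepalive(probe: Callable[[], None],
                  interval_seconds: float,
                  stop: threading.Event,
                  max_idle_seconds: float = 300.0) -> int:
    """Call probe() at a fixed interval until stop is set.

    Returns the number of times probe() was called.
    max_idle_seconds is an assumed firewall idle limit (300 s here).
    """
    if not 0 < interval_seconds <= max_idle_seconds:
        raise ValueError("interval must be positive and within the idle limit")
    calls = 0
    while not stop.is_set():
        probe()                      # hypothetical: generate driver traffic
        calls += 1
        stop.wait(interval_seconds)  # sleeps, but wakes early if stop is set
    return calls
```

Using an `Event` instead of `time.sleep` lets the loop be stopped promptly, for example when the driver is shut down.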

Casper




Anonymous_User Absent Member.

Re: eDir driver connections getting stuck in CLOSE_WAIT state

On 02/03/2014 08:25 AM, Casper Pedersen wrote:
> Kind of a known issue.
>
> - do you see the issue from both sides ?
> - is the firewall a stateful firewall ?
>
> One of the major issues with firewalls is that they decide to do things
> which work against connections which are open for a long time - for
> example with the edir2edir driver, when it starts up it will open a
> connection to the other side, the port is forwarded by the firewall and
> then monitored. Now after a while you might not have any sync going on,
> and after a while the firewall might think something dodgy is wrong, and
> decide to kill the connection ... driver starts having problems...
>
> I have seen your issue before, and the way I've figured out how to fix is
> to use a job which just does a search for an object on the destination
> every 2 - 5 minutes (max 300 seconds), and then do nothing more.
>
> It's not a problem with IDM and it's not a problem with the firewall, it's
> just the way it works.


I agree with everything but the last line. It IS a problem with the
firewall. Firewalls that just stop allowing an established connection to
continue working are misconfigured. Assuming there is a need to do this
(and that's a tall assumption, in my opinion, since what's the purpose
exactly, unless the timeout is really, really high, for example a few
days or maybe a week), the firewall should at least tell both sides that
it is closing things by sending a RST-flagged packet to each side. This
way IDM will know that something is no longer connected and it can, if
needed (it might not unless an event is coming), reconnect. Otherwise you
end up with exactly your situation. CLOSE_WAIT means that the server has
received the other side's FIN and is still waiting for the local
application to close the socket, and it will never show up if the
connection is gone (RST sent to both sides).

Fix your firewall to either allow the connections to last indefinitely or
else actually close them, rather than silently dropping packets for an
otherwise valid connection. The other workarounds already listed will
work as long as the firewall is closing this connection based on time of
inactivity (as has been implied) rather than purely on the age of the
established connection. Both types exist. The inactivity-based ones are
less evil, but ones that close based on the age of an established
connection (regardless of activity level) also exist and are even more
asinine.
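
To make the CLOSE_WAIT state a bit more concrete, here is a small loopback sketch in Python. It is unrelated to IDM's actual code and the function name is made up. It shows the general TCP behavior: once the peer closes its end, the local read returns an empty result, and the socket stays in CLOSE_WAIT until the local application closes it too.

```python
import socket


def peer_close_leaves_half_open() -> bytes:
    """Return what the server side reads after the client closes its end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.close()                    # client sends FIN
    data = server_side.recv(1024)     # b"" means the peer's FIN has arrived
    # From here until server_side.close(), the kernel holds server_side in
    # CLOSE_WAIT; `netstat -tan` run during this window would list it.
    server_side.close()               # only the local close leaves CLOSE_WAIT
    listener.close()
    return data
```

If an application never reaches that final `close()`, for whatever reason, the sockets pile up in CLOSE_WAIT.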

--
Good luck.

If you find this post helpful and are logged into the web interface,
show your appreciation and click on the star below...
Anonymous_User Absent Member.

Re: eDir driver connections getting stuck in CLOSE_WAIT state


I agree on the behavior of these firewalls. Makes no sense, especially
for internal traffic. But it is what it is. I'm pushing on that side
too, believe me.

However, I still think IdM should be able to recover. What if something
else caused network interruption? Why can't it recover from this? To
me, this is still somewhat of a bug.

At least we do have a workaround though, so I guess it'll be a
stalemate.

Matt

Oh, Update, got this from the customer:

"I think the reason the firewall drops the connection is that it has a
fixed amount of space to store established connection information. The
standard approach is to drop the least recently used connection when a
new one comes in."





Anonymous_User Absent Member.

Re: eDir driver connections getting stuck in CLOSE_WAIT state

On 02/03/2014 12:34 PM, matt wrote:
>
> I agree on the behavior of these firewalls. Makes no sense, especially
> for internal traffic. But it is what it is. I'm pushing on that side
> too, believe me.
>
> However, I still think IdM should be able to recover. What if something
> else caused network interruption? Why can't it recover from this? To
> me, this is still somewhat of a bug.


IDM/eDir, like any application, doesn't have much control at this layer in
most cases. The TCP stack, as part of the OS, does the connection
management at the request of the application. While it's possible to
build timers into applications, generally that should not be needed.
There is an RFC describing known limitations/weaknesses in TCP, with some
good workarounds, which mentions a stuck CLOSE_WAIT, but usually this is a
problem when you have a service with a lot of clients (an HTTP server, for
example), since those stuck connections can consume server resources
until something crashes. Those cases, though, do not have the added error
of routers killing connections without notifying either party that the
line is dead. Since the TCP spec states that the connection should be
torn down gracefully, and since the packets being sent to do exactly that
cannot get through and are being silently dropped (that is the problem,
vs. RST), both sides are supposed to (by spec) keep trying. Doing
otherwise means losing data that the application thinks has been sent to
the other side of the connection, and losing data is not supposed to
happen in TCP. If the router performs correctly and sends RST packets at
the time the connection is killed, or if it sends RST packets back to the
two sides that are trying to talk to each other, then the connection is
torn down and the application can move on with its retry logic.
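
For reference, the OS-level mechanism closest to the "timers" mentioned above is TCP keepalive, which an application can request per socket. The Python sketch below only illustrates that mechanism. Whether the IDM engine or the eDirectory driver enables it is not stated anywhere in this thread, and the function name and timing values are assumptions. The `TCP_KEEP*` options are guarded because they are not available on every platform.

```python
import socket


def enable_tcp_keepalive(sock: socket.socket, idle: int = 60,
                         interval: int = 30, probes: int = 4) -> None:
    """Ask the OS to send keepalive probes on an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idle time before the first probe (available on Linux).
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between unanswered probes.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Number of unanswered probes before the connection is declared dead.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

With probes flowing, an inactivity-based firewall timer keeps getting refreshed, and if the path really is dead the OS eventually reports an error to the application instead of leaving it waiting.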

> At least we do have a work around though, so I guess it'll be a
> stalemate.
>
> Matt
>
> Oh, Update, got this from the customer:
>
> "I think the reason the firewall drops the connection is that it has a
> fixed amount of space to store established connection information. The
> standard approach is to drop the least recently used connection when a
> new one comes in."


Um.......... does this sound like a great way to cause a denial of service
on your system? All I need to do is create enough connections that are
quickly torn down to work through the connection table's memory?
Hmmmm................

cpedersen Outstanding Contributor.

Re: eDir driver connections getting stuck in CLOSE_WAIT state

On 2/3/14, 8:34 PM, matt wrote:
>
> I agree on the behavior of these firewalls. Makes no sense, especially
> for internal traffic. But it is what it is. I'm pushing on that side
> too, believe me.


Security people are not known for being the most flexible people....

> However, I still think IdM should be able to recover. What if something
> else caused network interruption? Why can't it recover from this? To
> me, this is still somewhat of a bug.


This is not the only driver that suffers from this; it is a generic
issue. The same thing can occur with the Remote Loader. And it would
require a load of work to change.

The only other functional way to sort out something like this would be to
open a new connection for each operation, which isn't a good idea either...

> At least we do have a work around though, so I guess it'll be a
> stalemate.


I would suggest opening an enhancement request and adding the business
impact; the PM will then look at it and decide whether this should be
changed at some point: http://www.novell.com/rms/

> Oh, Update, got this from the customer:
>
> "I think the reason the firewall drops the connection is that it has a
> fixed amount of space to store established connection information. The
> standard approach is to drop the least recently used connection when a
> new one comes in."
>


Nope, that is not the case. It's just the way a stateful firewall works: a
connection without traffic for x number of seconds/minutes is treated as a
stale connection and will be taken down.

Of course, if it's a tiny or overloaded firewall then it could occur earlier...

Casper



cpedersen Outstanding Contributor.

Re: eDir driver connections getting stuck in CLOSE_WAIT state

Sure, if it's something that can be fixed at the firewall, then fix it;
if not, use the job option.

Casper




Anonymous_User Absent Member.

Re: eDir driver connections getting stuck in CLOSE_WAIT state


Hi Matt,
Maybe my input will add more confusion, but I would like to add some
information about a "similar" issue (potentially the two are related).
I ran into an issue restarting the traditional eDirectory driver after
upgrading eDirectory to 8.8.8.
The eDirectory driver got stuck during the stop/restart operation: the
port it uses (8196) was not released, which prevented the driver from
starting again.
I have IDM 4.0.2.3 and eDirectory 8.8.8, and I had this issue on systems
hosted on both Linux distributions (RedHat and SLES).
I plan to check the status after upgrading eDirectory to 8.8.8.1, which
was released last week; maybe this new eDir patch will fix the issue.

Alex


--
al_b
------------------------------------------------------------------------
al_b's Profile: https://forums.netiq.com/member.php?userid=209
View this thread: https://forums.netiq.com/showthread.php?t=49883

Anonymous_User Absent Member.

Re: eDir driver connections getting stuck in CLOSE_WAIT state

al b,

Your issue is likely because of Bug# 849111 which is marked as being fixed
in IDM 4.0 SP2 Patch 4 (you're on Patch 3).

On a possibly relevant note, if you installed eDirectory 8.8 SP8 and THEN
installed IDM 4.0 SP2, you may need to use the '--force' option with the
eDirectory 8.8 SP8 Patch 1 installer, since the IDM installer installs a
few old RPMs from eDir 8.8 SP7, at which the 8.8 SP8 Patch 1 installer then
balks. I've used the --force (-f) option and it seems to be fine, but this
will only affect you if IDM came after eDir 8.8 SP8.

Anonymous_User Absent Member.

Re: eDir driver connections getting stuck in CLOSE_WAIT state


Thank you for the update, Aaron!


--
al_b

Anonymous_User Absent Member.

Re: eDir driver connections getting stuck in CLOSE_WAIT state


RESOLVED:

I ran into this same issue this week. Our eDir to eDir drivers would
lose connection after only 2 minutes of no activity. They are connected
through a firewall.

We added this config on the Driver in the Subscriber and the Publisher
options, and it solved the problem (see below).

<definition display-name="Receive timeout in minutes" name="keep-alive-interval" range-lo="1" type="integer">
  <description>
    In order to detect a lost TCP/IP connection the eDirectory to eDirectory
    driver periodically sends small packets. This value determines how long
    since entering a receive-wait condition the publisher channel will wait
    until sending a "keep-alive" packet to determine if the TCP/IP connection
    has been lost. Generally this value should not be changed except under
    instruction from Novell. The default value for the publisher channel is
    ten minutes.
  </description>
  <value>1</value>
</definition>


--
ckynaston
------------------------------------------------------------------------
ckynaston's Profile: https://forums.netiq.com/member.php?userid=7175
View this thread: https://forums.netiq.com/showthread.php?t=49883

Anonymous_User Absent Member.

Re: eDir driver connections getting stuck in CLOSE_WAIT state


Yep, we see it on both sides. And I'm fairly sure the firewall is
stateful, but I can verify.

We're basically using a cron job that does an LDAP modify, but I like your
idea of using a search better. But how did you schedule a job every 5
minutes? It doesn't seem to like */5 * * * *. I had to use:

0,5,10,15,20,25,30,35,40,45,50,55 * * * *
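
In case the step syntax is rejected for other intervals too, the explicit minute list can be generated rather than typed by hand. A tiny Python sketch follows; the helper name `expand_step` is made up, and it assumes the scheduler accepts a plain comma-separated list as shown above.

```python
def expand_step(step: int, limit: int = 60) -> str:
    """Expand a cron-style '*/step' field into an explicit comma list."""
    if step <= 0 or step > limit:
        raise ValueError("step must be between 1 and the field limit")
    return ",".join(str(value) for value in range(0, limit, step))


print(expand_step(5))  # → 0,5,10,15,20,25,30,35,40,45,50,55
```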

Matt



