mdemel Absent Member.

eDirectory hanging at random times.

I have a SLES 11 SP4, OES 2015.1 system that has been giving me trouble for a while. eDirectory stops responding at random times. I can stop the service with rcndsd stop, but this can take up to ten minutes or so to complete, and the log file says "WARNING: ndsd process is still running. Killing ndsd." I can then start it back up and everything is good. It can run for a couple of months, or it can stop again the next day. Over time the hangs have become more frequent (from every 4 to 6 months to every week or two), so I put in a cron job to restart eDirectory every weekend. This has prevented most problems; however, it can restart one weekend and stop responding again the following Monday or Tuesday.
I have also noticed over time that when eDirectory starts to misbehave, login times increase for some people, logins seem to work OK for others, and a lot of people cannot log in at all. In the past I could always tell by looking at the server processes (top command), where ndsd would eat a lot of CPU. Recently, though, I have noticed that when this problem occurs the CPU usage by ndsd is really not that bad.


We only have about 1200 users and 700 workstations. The system holds the master replica and is a VM with 4 threads and 11 GB RAM. This was an issue even before it was converted from a bare metal system to a VM a few years ago. This server is in a read/write replica ring with 3 other servers and only has DS, LDAP, Storage Manager and iManager running on it. The other servers have never had an issue; they also have printers (iPrint), Storage Manager, and NSS volumes for our users. ndsd seems to be pretty quiet most of the time and will have relatively high CPU usage every once in a while for a short period, so I don't think it's a load issue.
We have a ZENworks 2018 system, an iBoss web filter, a GroupWise system and a few other systems that all hit this server. The "other systems" really don't have very many transactions. By far the most frequent hits are from ZENworks and our iBoss web filter. I used an LDAP trace from document 7007106 to get some info.

I may have gone overboard with the description, but I'm at a loss as to what the problem is and where to look. I'm also not very good at troubleshooting eDirectory or LDAP issues; it tends to go over my head for the most part. The seemingly random nature of the issue also makes it very hard to narrow down what or where the problem is.

Thanks for any help or direction you can give,
Michael
Knowledge Partner

Re:eDirectory hanging at random times.


I see that you originally posted this issue in the eDirectory
Linux forum @
https://forums.novell.com/showthread.php/510891 but have been
redirected here.

Since you noted that "This was an issue even before it was
converted from a bare metal system to a VM a few years ago" can I
please ask whether this server was originally installed as
OES2015 SP1 (on SLES11 SP4) or has it been upgraded from/through
previous release(s)?

Is the server up to date regarding patches?

HTH.
--
Simon Flood
Micro Focus Knowledge Partner


mdemel Absent Member.

Re: eDirectory hanging at random times.

It's been upgraded over the years. I think it was originally an OES 2 server, but it could have been a NetWare 6.5 SP8 system; it's been a while.


Michael

Re: eDirectory hanging at random times.

mdemel wrote:

>
> Its been upgraded over the years. I think it was originally a OES 2
> server but it could have been a Netware 6.5 SP8 system, it been a
> while.
>
>
>
> Michael


Hi Michael,

I can pretty much guarantee it wasn't a NetWare system. NetWare can't
be upgraded to Linux.

Someone would have had to create a new OES/Linux system then migrate
services and data from the NetWare server.

--
Kevin Boyle - Knowledge Partner

Re: eDirectory hanging at random times.

In article <x1O0E.1063$h_7.636@novprvlin0913.provo.novell.com>, Simon
Flood wrote:
> By far the most frequent hits are from Zenworks
> and our iBoss web filter


That sounds similar enough to an issue I had with an OES2 box that was
the target of WebSense LDAP, where the number of open file descriptors
held by namcd (its "proc" count below) would occasionally max out from
all the ID/IP matching.

To check that number, use the command
ls /proc/`ps -ea |grep namcd | cut -c1-5 |xargs`/fd/ | wc -l
and compare the result on a fresh boot vs day to day vs when there is a
problem.
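
The same check can be wrapped in a small function so it is easier to repeat at those three points. This is only a minimal sketch: count_fds is a hypothetical helper name (not an OES tool), and it assumes pgrep is installed and that the oldest namcd process is the one worth watching.

```shell
#!/bin/bash
# count_fds is a hypothetical helper, not part of OES.
# It prints the number of open file descriptors for the given PID
# by listing /proc/<pid>/fd (root is needed for other users' processes).
count_fds() {
    local pid="$1"
    ls "/proc/$pid/fd" 2>/dev/null | wc -l
}

# Assumption: pgrep is available; -o picks the oldest matching process,
# -x matches the process name exactly.
NAMCD_PID=$(pgrep -o -x namcd)
if [ -n "$NAMCD_PID" ]; then
    echo "namcd (PID $NAMCD_PID) has $(count_fds "$NAMCD_PID") open file descriptors"
else
    echo "namcd is not running"
fi
```

Unlike the ps/cut pipeline, this does not depend on the PID fitting in the first five columns of the ps output.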

I created a script to monitor it and restart namcd when its proc got
past 400 and that worked well for us. I also added to have it clear
'Not Logged In' (NLI) connections when they got past 2000 as that is
still an issue with OES2018 with WebSense hitting it, and too many of
those connections do slow a system down.

If you use the script, there are 2 things you might want to change
- if you want NLI connections cleared at a different threshold than
2000, edit the "if [ "$NLICOUNT" -gt "2000" ]" line
- near the bottom you should change out the email addresses in the
mailto line to something your server can get to, or comment out the
mailto line

I dropped this into /etc/cron.hourly/ncpnli.sh which appeared to work
well enough.


#!/bin/bash
# created 2017-07-17 by Andy Konecny of Konecny Consulting. Last updated 2019-02-21
# track how many NOT LOGGED IN there are by time, along with other bits added since
# Also checks if processes for namcd are collecting and restarts it if they are too high

# Create the CSV log with a header row on first run
if [ -f /var/opt/novell/log/ncpnli.csv ]; then
    echo "log already created"
else
    echo "new log creation"
    echo "Date, NLI, namcd proc, Open Files, Max Open Files, NetStat-plan, NetStat-tanpu" > /var/opt/novell/log/ncpnli.csv
fi

DATE=`date +%Y-%m-%d:%H:%M:%S`
CNTCOUNT=`ncpcon connection |grep Used | cut -f3`
NLICOUNT=`ncpcon connection |grep Logged | cut -f3`
#PSNAM=`ps -ea |grep namcd | cut -c2-5`
#PSNAMCT= `ls /proc/$PSNAM/fd/ |wc -l`
#ls /proc/$PSNAM/fd/ |wc -l >$PCTEST
# Number of open file descriptors held by namcd
PSNAMCT=$(ls /proc/`ps -ea |grep namcd | cut -c1-5 |xargs`/fd/ | wc -l)

# Append one CSV row per run
echo "$DATE, $NLICOUNT, $PSNAMCT, `cat /proc/sys/fs/file-nr | cut -f1`, `cat /proc/sys/fs/file-max`, `netstat -plan | wc -l`, `netstat -tanpu |wc -l`" >>/var/opt/novell/log/ncpnli.csv
echo "$CNTCOUNT NCP connection slots being used"
echo "$NLICOUNT Not Logged In NCP connections found"
if [ "$NLICOUNT" -gt "2000" ] ; then
    echo " that is too many Not Logged In NCP connections. Now doing something about it."
    ncpcon connection clearNLI
    echo "now that should be clearer"
fi
#echo "$PSNAM, $PSNAMCT, $PSTEST"
#cat /var/opt/novell/log/ncpnli.txt

if [ "$PSNAMCT" -gt "400" ] ; then
    echo "$PSNAMCT is greater than 400 processes for namcd process. logging and restarting it"
    echo Gathering the state of the server
    date >/var/opt/novell/log/namcdres.txt
    uptime >>/var/opt/novell/log/namcdres.txt
    echo >>/var/opt/novell/log/namcdres.txt
    free -m >>/var/opt/novell/log/namcdres.txt
    echo >>/var/opt/novell/log/namcdres.txt
    echo "There are $PSNAMCT open file descriptors for namcd on $HOSTNAME" >>/var/opt/novell/log/namcdres.txt

    echo >>/var/opt/novell/log/namcdres.txt
    # Capture a stack trace of namcd before restarting it
    gstack `ps -ea |grep namcd |cut -c1-5 |xargs` >>/var/opt/novell/log/namcdres.txt

    echo >>/var/opt/novell/log/namcdres.txt
    mail -s "There are $PSNAMCT OFDs on $HOSTNAME, restarting namcd" -a /var/opt/novell/log/namcdres.txt akonecny@mail.private.domain.com,admin@ml2.private.domain.com
    atd
    rcnamcd restart
    # Re-run this script in 5 minutes to log the state after the restart
    echo "sh /etc/cron.hourly/ncpnli.sh" | at now + 5 min
fi

exit
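
One thing to watch with the two comparisons in the script: if ncpcon or the /proc lookup returns nothing, a test like [ "" -gt "2000" ] prints an "integer expression expected" error and the branch is simply skipped. A minimal sketch of a guard follows, assuming the same limits as above. over_threshold, NLI_LIMIT and NAMCD_FD_LIMIT are hypothetical names for illustration, not part of the original script.

```shell
#!/bin/bash
# Hypothetical variable names; the values are the ones used in the script above.
NLI_LIMIT=2000      # clear Not Logged In connections above this
NAMCD_FD_LIMIT=400  # restart namcd above this

# over_threshold VALUE LIMIT
# Returns 0 only when VALUE is a non-negative integer greater than LIMIT.
# Empty or non-numeric input returns 1 instead of raising a test error.
over_threshold() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt "$2" ]
}

# Example calls with made-up sample values
if over_threshold "2500" "$NLI_LIMIT"; then
    echo "too many Not Logged In connections"
fi
if ! over_threshold "" "$NAMCD_FD_LIMIT"; then
    echo "no namcd count available, or count within limit"
fi
```

In the script itself the two if lines would then become something like if over_threshold "$NLICOUNT" "$NLI_LIMIT" ; then, so the thresholds only need changing in one place.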




Andy of
http://KonecnyConsulting.ca in Toronto

Re: eDirectory hanging at random times.

Andy Konecny wrote:

> I created a script to monitor it and restart namcd when its proc got
> past 400 and that worked well for us.


You might like to add it to this list of other Cool Tools...
https://www.novell.com/communities/coolsolutions/cool_tools/

--
Kevin Boyle - Knowledge Partner

Re: eDirectory hanging at random times.

In article <oh2eE.3421$C62.1420@novprvlin0914.provo.novell.com>, Kevin
Boyle wrote:
> You might like to add it to this list of other Cool Tools...
> https://www.novell.com/communities/coolsolutions/cool_tools/


It is already on that all too long list of things to do for that list.
As part of that I am exploring GitHub as a place to keep them, allowing
them to be updated more readily moving forward, only to have some
Lithium added to our e-diet that shows another option coming soon.


Andy of
http://KonecnyConsulting.ca in Toronto