Anonymous_User Absent Member.

ndsd cpu load 250%+


So I am getting reports of timeouts when trying to authenticate against
one of our eDirectory servers. We have two servers set up, both VMs. We
intended to do a round robin but due to issues we mostly just split all
the services. This one server lately is consistently over 250% CPU load.
As I type this it is actually over 600%! What could be causing this
issue?


--
bobbintb
------------------------------------------------------------------------
bobbintb's Profile: https://forums.netiq.com/member.php?userid=5629
View this thread: https://forums.netiq.com/showthread.php?t=51328


Re: ndsd cpu load 250%+

What is the application doing, exactly and specifically? If you have a
poorly-written application that is hitting the server hundreds of times
per second, that could do it. If you have an application that is trying
to do a subtree search for some attribute and you have not defined an
index on it, that may also cause a problem. The first stop is to find out
what the box is doing and often tracing LDAP is a good place to do that.


#set LDAP tracing options properly
ldapconfig set 'LDAP Screen Level=all'

#Run ndstrace to capture data to a file.
ndstrace
set dstrace=nodebug
dstrace +time +tags +ldap
set dstrace=*m9999999
dstrace file on
set dstrace=*r
#wait for a second here to capture data.
dstrace file off
quit


Post the (by default) /var/opt/novell/eDirectory/log/ndstrace.log file and
let's see what is happening.

--
Good luck.

If you find this post helpful and are logged into the web interface,
show your appreciation and click on the star below...

Re: ndsd cpu load 250%+


It stopped before I had a chance to try. I will try your suggestion if
it happens again.


--
bobbintb

Re: ndsd cpu load 250%+


So this just started happening again. I restarted the service and the
load immediately went critical again:


Code:
--------------------
top - 13:59:56 up 104 days, 13:51, 3 users, load average: 18.72, 16.61, 11.57
Tasks: 279 total, 1 running, 278 sleeping, 0 stopped, 0 zombie
Cpu(s): 92.8%us, 5.5%sy, 0.0%ni, 1.5%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 16249880k total, 16023036k used, 226844k free, 432292k buffers
Swap: 8388600k total, 134056k used, 8254544k free, 7064712k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4698 root 20 0 8085m 7.0g 26m S 1216.6 45.1 3717:53 ndsd
24778 anglarma 20 0 15172 1388 952 R 6.9 0.0 0:01.59 top
51 root 20 0 0 0 0 S 0.3 0.0 7:32.74 events/0
56 root 20 0 0 0 0 S 0.3 0.0 3:33.32 events/5
57 root 20 0 0 0 0 S 0.3 0.0 3:59.70 events/6
83 root 20 0 0 0 0 S 0.3 0.0 1:23.92 kblockd/1
94 root 20 0 0 0 0 S 0.3 0.0 0:07.19 kacpid
2064 root 20 0 326m 15m 12m S 0.3 0.1 1:07.53 EvMgrC
7025 novlwww 20 0 771m 455m 6728 S 0.3 2.9 133:03.26 java
1 root 20 0 19356 524 316 S 0.0 0.0 0:09.48 init
2 root 20 0 0 0 0 S 0.0 0.0 0:01.38 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:01.38 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:15.86 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
6 root RT 0 0 0 0 S 0.0 0.0 0:10.14 watchdog/0
7 root RT 0 0 0 0 S 0.0 0.0 0:01.34 migration/1
8 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1

--------------------
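A note on reading the figure above: in its default mode, top reports %CPU relative to a single core, so a multithreaded process such as ndsd can legitimately show several hundred percent. A minimal sketch of the conversion, assuming the default column layout shown in the output above (the helper name `busy_cores` is hypothetical, not part of any tool):

```python
def busy_cores(top_line):
    """Convert the %CPU column of a top process line into a rough core count.

    Assumes the default column order:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    fields = top_line.split()
    return float(fields[8]) / 100.0


# The ndsd line from the output above.
line = "4698 root 20 0 8085m 7.0g 26m S 1216.6 45.1 3717:53 ndsd"
print(f"ndsd is keeping roughly {busy_cores(line):.1f} cores busy")
```

By that reading, ndsd was keeping about a dozen cores fully busy, which fits the load average of 18.72 on the first line.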


I set the trace to only show LDAP and this is what I got:


Code:
--------------------
09/23/2014
14:19:06 84475700 LDAP: BIO ctrl called with unknown cmd 7
14:19:08 80233700 LDAP: BIO ctrl called with unknown cmd 7
14:19:11 86293700 LDAP: BIO ctrl called with unknown cmd 7
14:19:14 87BAC700 LDAP: BIO ctrl called with unknown cmd 7
14:19:14 86E9F700 LDAP: BIO ctrl called with unknown cmd 7
14:19:14 871A2700 LDAP: BIO ctrl called with unknown cmd 7
14:19:14 86697700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 86394700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 80637700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 84B7C700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 87AAB700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 86FA0700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 86D9E700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 82657700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 84172700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 88CBD700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 9F17B700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 83162700 LDAP: BIO ctrl called with unknown cmd 7
14:19:15 9A7B0700 LDAP: BIO ctrl called with unknown cmd 7
14:19:16 9448F700 LDAP: BIO ctrl called with unknown cmd 7
14:19:16 84475700 LDAP: BIO ctrl called with unknown cmd 7
14:19:16 886B7700 LDAP: BIO ctrl called with unknown cmd 7
14:19:16 81041700 LDAP: BIO ctrl called with unknown cmd 7
14:19:16 86495700 LDAP: BIO ctrl called with unknown cmd 7
14:19:16 87BAC700 LDAP: BIO ctrl called with unknown cmd 7
14:19:16 81445700 LDAP: BIO ctrl called with unknown cmd 7
14:19:16 85F90700 LDAP: BIO ctrl called with unknown cmd 7
14:19:17 883B4700 LDAP: BIO ctrl called with unknown cmd 7
14:19:17 99F98700 LDAP: BIO ctrl called with unknown cmd 7
14:19:18 82859700 LDAP: BIO ctrl called with unknown cmd 7
14:19:18 8598A700 LDAP: BIO ctrl called with unknown cmd 7
14:19:18 95890700 LDAP: BIO ctrl called with unknown cmd 7
14:19:18 85C8D700 LDAP: BIO ctrl called with unknown cmd 7
14:19:19 82859700 LDAP: BIO ctrl called with unknown cmd 7
14:19:19 82E5F700 LDAP: BIO ctrl called with unknown cmd 7
14:19:20 86C9D700 LDAP: BIO ctrl called with unknown cmd 7
14:19:21 82D5E700 LDAP: BIO ctrl called with unknown cmd 7
14:19:22 81748700 LDAP: TLS accept failure 5 on connection 0x44490380, setting err = -5875. Error stack:
14:19:22 81748700 LDAP: TLS handshake failed on connection 0x44490380, err = -5875
14:19:23 82051700 LDAP: TLS accept failure 5 on connection 0x44490380, setting err = -5875. Error stack:
14:19:23 82051700 LDAP: TLS handshake failed on connection 0x44490380, err = -5875
14:19:41 874A5700 LDAP: BIO ctrl called with unknown cmd 7
14:19:42 86293700 LDAP: TLS accept failure 5 on connection 0x430ee000, setting err = -5875. Error stack:
14:19:42 86293700 LDAP: TLS handshake failed on connection 0x430ee000, err = -5875
14:19:43 9F983700 LDAP: TLS accept failure 5 on connection 0x430ee000, setting err = -5875. Error stack:
14:19:43 9F983700 LDAP: TLS handshake failed on connection 0x430ee000, err = -5875
14:19:53 84879700 LDAP: BIO ctrl called with unknown cmd 7
14:19:53 85485700 LDAP: BIO ctrl called with unknown cmd 7
14:20:02 80A3B700 LDAP: TLS accept failure 5 on connection 0x430ee380, setting err = -5875. Error stack:
14:20:02 80A3B700 LDAP: TLS handshake failed on connection 0x430ee380, err = -5875
14:20:03 8A4ED700 LDAP: TLS accept failure 5 on connection 0x430ee380, setting err = -5875. Error stack:
14:20:03 8A4ED700 LDAP: TLS handshake failed on connection 0x430ee380, err = -5875
14:20:13 81344700 LDAP: BIO ctrl called with unknown cmd 7
14:20:13 85889700 LDAP: BIO ctrl called with unknown cmd 7
14:20:14 9EA74700 LDAP: BIO ctrl called with unknown cmd 7
14:20:14 874A5700 LDAP: BIO ctrl called with unknown cmd 7
14:20:17 86697700 LDAP: BIO ctrl called with unknown cmd 7
14:20:17 81748700 LDAP: Failed to authenticate local on connection 0x436a6000, err = failed authentication (-669)
14:20:17 85B8C700 LDAP: BIO ctrl called with unknown cmd 7
14:20:17 883B4700 LDAP: Failed to authenticate local on connection 0x434ed180, err = failed authentication (-669)
14:20:20 86899700 LDAP: Failed to authenticate local on connection 0x430ee380, err = failed authentication (-669)
14:20:22 87AAB700 LDAP: TLS accept failure 5 on connection 0x430ee380, setting err = -5875. Error stack:
14:20:22 87AAB700 LDAP: TLS handshake failed on connection 0x430ee380, err = -5875
14:20:23 84E7F700 LDAP: TLS accept failure 5 on connection 0x430ee380, setting err = -5875. Error stack:
14:20:23 84E7F700 LDAP: TLS handshake failed on connection 0x430ee380, err = -5875
14:20:23 83667700 LDAP: BIO ctrl called with unknown cmd 7
14:20:24 877A8700 LDAP: BIO ctrl called with unknown cmd 7
14:20:27 86495700 LDAP: Failed to authenticate local on connection 0x430ee000, err = failed authentication (-669)
14:20:30 83B6C700 LDAP: BIO ctrl called with unknown cmd 7
14:20:30 82051700 LDAP: BIO ctrl called with unknown cmd 7
14:20:31 82F60700 LDAP: BIO ctrl called with unknown cmd 7
14:20:33 80E3F700 LDAP: Failed to authenticate local on connection 0x430ee000, err = failed authentication (-669)
14:20:42 9448F700 LDAP: TLS accept failure 5 on connection 0x430ee380, setting err = -5875. Error stack:
14:20:42 9448F700 LDAP: TLS handshake failed on connection 0x430ee380, err = -5875
14:20:43 83667700 LDAP: TLS accept failure 5 on connection 0x430ee380, setting err = -5875. Error stack:
14:20:43 83667700 LDAP: TLS handshake failed on connection 0x430ee380, err = -5875
14:21:00 86B9C700 LDAP: BIO ctrl called with unknown cmd 7
14:21:00 874A5700 LDAP: BIO ctrl called with unknown cmd 7
14:21:02 82657700 LDAP: TLS accept failure 5 on connection 0x430ee380, setting err = -5875. Error stack:
14:21:02 82657700 LDAP: TLS handshake failed on connection 0x430ee380, err = -5875
14:21:03 9578F700 LDAP: TLS accept failure 5 on connection 0x430ee380, setting err = -5875. Error stack:
14:21:03 9578F700 LDAP: TLS handshake failed on connection 0x430ee380, err = -5875
14:21:04 83869700 LDAP: Failed to authenticate local on connection 0x430ee000, err = failed authentication (-669)
--------------------



I probably have to set some more trace options but I really don't know
which. It seems to have subsided by itself for now.


--
bobbintb

Re: ndsd cpu load 250%+

On Tue, 23 Sep 2014 20:26:47 +0000, bobbintb wrote:

> So this just started happening again. I restarted the service and the
> load immediately went critical again:


I'm not convinced this is a problem internal to eDirectory. Tracking down
misperforming clients is always a pain, but I think that's what you're
looking for here.


> 14:19:22 81748700 LDAP: TLS accept failure 5 on connection 0x44490380,
> setting err = -5875.

You have a bunch of these, which basically indicates that a client
talking to your server, and your server, can't agree on the SSL layer.
That's most likely a client problem if your server is otherwise working
normally.

> 14:20:20 86899700 LDAP: Failed to
> authenticate local on connection 0x430ee380, err = failed
> authentication (-669)


Then you have a bunch of these. -669 is normally what you see when the
username or password is wrong.

So given this it looks to me like something is hammering on your server
attempting first to get a working SSL connection, then trying to log in
with an invalid DN or password.
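To put numbers on that pattern, the trace can be tallied by message type. The following is a rough sketch, assuming the line layout seen in the trace posted above (time, thread ID, "LDAP:", message). The category names and the `summarize` helper are made up for illustration. The connection value appears to be reused across entries, so it should not be read as a client identifier.

```python
import re
from collections import Counter

# Layout assumed from the trace in this thread: time, thread id, "LDAP:", message.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) ([0-9A-F]+) LDAP: (.*)$")
CONN_RE = re.compile(r"on connection (0x[0-9a-f]+)")


def classify(message):
    """Bucket a trace message into a coarse, illustrative category."""
    if "TLS accept failure" in message or "TLS handshake failed" in message:
        return "tls_failure"
    if "(-669)" in message:
        return "auth_failure"
    if "BIO ctrl called" in message:
        return "bio_ctrl"
    return "other"


def summarize(lines):
    """Count messages per category and per (connection, category) pair."""
    kinds = Counter()
    per_conn = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skips the date line and anything else unrecognized
        message = match.group(3)
        kind = classify(message)
        kinds[kind] += 1
        conn = CONN_RE.search(message)
        if conn:
            per_conn[(conn.group(1), kind)] += 1
    return kinds, per_conn


# A few lines copied from the trace above.
sample = [
    "09/23/2014",
    "14:19:21 82D5E700 LDAP: BIO ctrl called with unknown cmd 7",
    "14:19:22 81748700 LDAP: TLS accept failure 5 on connection 0x44490380, setting err = -5875. Error stack:",
    "14:19:22 81748700 LDAP: TLS handshake failed on connection 0x44490380, err = -5875",
    "14:20:17 81748700 LDAP: Failed to authenticate local on connection 0x436a6000, err = failed authentication (-669)",
    "14:20:20 86899700 LDAP: Failed to authenticate local on connection 0x430ee380, err = failed authentication (-669)",
]

kinds, per_conn = summarize(sample)
for kind, count in kinds.most_common():
    print(kind, count)
```

Running `summarize` over the full ndstrace.log instead of the sample would show whether TLS failures or -669 failures dominate during a spike.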


> I probably have to set some more trace options but I really don't know
> which. It seems to have subsided by itself for now.


I'm curious why you're not seeing the DN of the attempted authentication.
You're also not seeing the IP address the connections are coming from.
You need to go to the LDAP Server object, find the trace options tab, and
enable everything there that isn't "packet dump".


--
--------------------------------------------------------------------------
David Gersic dgersic_@_niu.edu
Knowledge Partner http://forums.netiq.com

Please post questions in the forums. No support provided via email.
If you find this post helpful, please click on the star below.

Re: ndsd cpu load 250%+


dgersic;249108 Wrote:
> I'm curious why you're not seeing the DN of the attempted
> authentication. You're also not seeing the IP address the connections
> are coming from. You need to go to the LDAP Server object, find the
> trace options tab, and enable everything there that isn't "packet dump".


Ok, it looks like only Critical Error Messages and Non-critical Error
Messages were selected. It looks like it's been about 104 days since this
last happened. I guess I will have to wait and see when it happens again,
but I have a good idea of the culprit. One of our systems was reporting
slowdown issues when this started. As I mentioned earlier, I restarted
the service and it didn't help. As soon as I told the admin to move over
to another server, the load on this server went back to normal, and the
move did not cause any issue on the server it was moved to.


--
bobbintb

Re: ndsd cpu load 250%+


This started happening again today and I set the LDAP trace as mentioned
and got a good trace. Can I send it to one or both of you to look at as
it has IP and usernames I'd rather not post? I'm pretty certain I have
identified the culprit but I still don't know what it is actually doing.


--
bobbintb

Re: ndsd cpu load 250%+

Compress please; ab at novell.com

--
Good luck.


Re: ndsd cpu load 250%+

Based on what I can see I would make sure you have value (vs. presence or
substring) indexes on the following attributes on this server, as well as
any others answering these same LDAP queries from various applications:

objectClass - Not a default, but IMO should be on every server
CN - Usually present already, but double-check
uniqueID - Usually present already, but double-check
gidNumber
memberUid
member

This query, in particular, is taking a long, long time to return:

"(&(objectClass=posixGroup)(gidNumber=1234))"

Be sure that at least objectClass is indexed, and preferably gidNumber as
well. This search is also taking a very long time to return:

"(&(objectClass=posixGroup)(memberUid=something-here))"

So besides objectClass, be sure that memberUid is indexed as well.

Let us know if that helps. Indexes add overhead, but generally if you add
them based on queries that are happening and are slow (as shown above)
they are well worth the memory and processing overhead, which is usually
not noticed.
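For anyone repeating this exercise with their own trace, the attributes a slow filter depends on can be pulled out mechanically and compared with what is already indexed. This is only a sketch: `filter_attributes` and `missing_indexes` are hypothetical helpers, and the `already_indexed` set is an assumption for illustration, not something read from the server. Check the real index list in iManager.

```python
import re

# An attribute name is whatever sits between "(" and a comparison operator.
ATTR_RE = re.compile(r"\(([A-Za-z][A-Za-z0-9;.-]*)\s*(?:=|~=|>=|<=)")


def filter_attributes(ldap_filter):
    """Return the attribute names used in an LDAP filter, in order, no repeats."""
    seen = []
    for name in ATTR_RE.findall(ldap_filter):
        if name not in seen:
            seen.append(name)
    return seen


def missing_indexes(filters, indexed):
    """List attributes used by the filters that are not in the indexed set.

    `indexed` holds lower-cased attribute names; comparison is case-insensitive.
    """
    needed = []
    for ldap_filter in filters:
        for attr in filter_attributes(ldap_filter):
            if attr.lower() not in indexed and attr not in needed:
                needed.append(attr)
    return needed


# The two slow searches quoted above.
slow_filters = [
    "(&(objectClass=posixGroup)(gidNumber=1234))",
    "(&(objectClass=posixGroup)(memberUid=something-here))",
]

# Assumption for illustration only: attributes that already have a value index.
already_indexed = {"cn", "uniqueid", "member"}

print(missing_indexes(slow_filters, already_indexed))
```

The script only reports candidates. Whether each one deserves a value index still depends on how often the query runs and how slow it is.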

--
Good luck.


Re: ndsd cpu load 250%+


I will look at the indexes and see if that helps but there is something
else I noticed which might be relevant. Talking to the admin of the
system in question and your response about the indexes led me to look
into how groups are set up. When browsing the group objects in iManager
I get the following error, especially on the "Dynamic" tab:


Code:
--------------------
LDAP Error

Unable to obtain a valid LDAP context.

Creating secure SSL LDAP context failed:
Invalid name: /:636
--------------------


I did a quick search and it looks like this error is related to a
certificate. In addition to the indexes, could a bad certificate be
compounding the issue, or is this totally off base?


--
bobbintb

Re: ndsd cpu load 250%+

I would bet this is related to being on eDirectory 8.8 SP8, assuming you
are. Be sure iManager is completely patched to the latest available, and
if that does not help try adding values to the ldapInterfaces attribute on
the LDAP Server object. iManager or 'ldapconfig' can set this for you,
but the defaults changed with 8.8 SP8 so perhaps that is making a difference.

In any case, those queries I called out before are insanely slow and
should be fixed first.

--
Good luck.


Re: ndsd cpu load 250%+


Well, a little update. I waited to add those indexes until it happened
again just to make sure. Only 3 of them weren't indexed: objectClass,
gidNumber, and memberUid. I watched the CPU load while indexing took
place. gidNumber and memberUid finished and CPU load was still around
500-600% for that process. The moment objectClass finished indexing, CPU
load plummeted to normal levels, so I am cautiously optimistic that was
the problem. Growing pains on my part. Thanks for the help.


--
bobbintb

Re: ndsd cpu load 250%+

Thanks for the reply. Yes, objectClass value indexes should probably be a
default, but still are not. Feel free to let NetIQ know that the old bug
(404629) still needs to be resolved.

--
Good luck.
