Intermittent -626 during high load

Hi,

A customer is seeing this in IDM driver traces:

Code(-9006) The driver returned a "retry" status indicating that the operation should be retried later. Detail from driver: Code(-9011) eDirectory returned an error indicating that the operation should be retried later: novell.jclient.JCException: getEffectivePrivileges -626 ERR_ALL_REFERRALS_FAILED

It can happen during a query, for example, or during an optimize-modify.

It seems to happen during high load.

Does anybody have a fix for this?


  • Ran into this with a customer; the cause was not having enough pool worker threads.

    ndstrace -c threads

    [1] Instance at /etc/opt/novell/eDirectory/conf/nds.conf:  acmeidm0001.OU=servers.OU=services.O=acme.ACMEIDV
    Thread Pool Information
    Summary      : Spawned 429, Died 133
    Pool Workers : Idle 271, Total 296, Peak 296
    Ready Work   : Current 1, Peak 18, maxWait 223041 us
    Sched delay  : Min 4 us, Max 2445098997 us, Avg: 8 us
    Waiting Work : Current 19, Peak 25

     

    This is from a heavy-use server that was updated to use far more threads than the default, and you can see it needed them.

     

    The relevant nds.conf setting:

    n4u.server.max-threads

    The maximum number of threads that will be started by the eDirectory server. This is the number of concurrent operations that can be done within the eDirectory server.

    Default = 64

    Range = 32 to 512

    Refer to the NetIQ eDirectory Troubleshooting Guide to set an optimum value.

     

    Alternatively, use ndsconfig to set it; either way the value ends up in that file.
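
    A minimal sketch of both routes (the file path is the one from the trace above; the exact restart step depends on the platform):

    # Option 1: add or change this line in /etc/opt/novell/eDirectory/conf/nds.conf
    n4u.server.max-threads=512

    # Option 2: have ndsconfig write the same value
    ndsconfig set n4u.server.max-threads=512

    # Restart eDirectory (ndsd) afterwards so the new limit takes effect.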

     

  • To add to Geoff's post:

    Whenever the pool workers Peak gets into the range of n4u.server.max-threads, it's time to raise the latter. Personally I never set it below 256 (mostly OES in my case, where interaction with the file system comes into play); if the peak is at, say, 200, I'd set the max to 512.
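
    A rough way to eyeball the two numbers side by side (sketch only; the nds.conf path is the one shown earlier in this thread, and if max-threads is absent the default of 64 applies):

    ndstrace -c threads | grep 'Pool Workers'
    grep 'max-threads' /etc/opt/novell/eDirectory/conf/nds.conf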

     

  • Thank you both. Increased to 512; we'll see what happens next.

  • I am seeing the same issue on one of our IDM servers.

    What is causing this, and what is the official solution?

    Can someone please translate these ndstrace -c threads values into simple language for an IDM developer?

     

    Thread Pool Information
    Summary : Spawned 4602, Died 4563
    Pool Workers : Idle 11, Total 39, Peak 51
    Ready Work : Current 1, Peak 22, maxWait 844774 us
    Sched delay : Min 2 us, Max 1536735 us, Avg: 1 us
    Waiting Work : Current 17, Peak 21

     

    /Maqsood.


  • The line to watch is Pool Workers. With yours at Idle 11, Total 39, Peak 51, that is pretty low and unlikely to be the issue here.

    The Ready Work line is a bit more worrisome, with a maxWait of 844774 microseconds. That seems high; possibly a slower disk array?
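
    For a rough plain-language reading of those counters (my interpretation, not official documentation):

    Summary      : worker threads spawned and exited since ndsd started
    Pool Workers : threads currently idle, threads currently alive, and the highest count seen so far
    Ready Work   : jobs queued and ready to run, the peak of that queue, and the longest a job waited for a free worker
    Sched delay  : min / max / average delay before a ready job was actually picked up
    Waiting Work : jobs parked (e.g. on a timer or event) that are not yet ready to run

    If Peak keeps approaching n4u.server.max-threads, the pool is too small (raise it, as above); a large maxWait with a mostly idle pool points more at slow individual operations than at thread starvation.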

  • In terms of IDM drivers:

    What value does Pool Workers show for IDM drivers?

    What is the maximum wait time for IDM drivers?

     

    We have had this platform for many years and never seen this error before. I think we are good on disks; this server runs many IDM drivers that use the Remote Loader.

     

    We recently added 12 cores to each of the IDM servers (we have 3 of them).

     

    Could it be related to DXQueue? (just asking)

     

    /Maqsood.

  • We had an SR open and I think we found the root cause. I don't know if I'm allowed to post it here, but it was very surprising.
  • Can you provide some background info on that SR and the root cause in a private message? We could also submit a similar SR if we're affected by the same root cause.

    -Maqsood.

  • Unless it is a security issue, I am not sure that any bugs you find with Support's help are confidential, unless they specifically ask you not to disclose them.

    Now you have our interest piqued!

  • Maqsood: The simple way to do this without compromising any info is to ask Alekz for the SR reference or a bug number, not the details. Your support person can then look up the result and decide whether they can share it.