OES23.4 ERROR: Could not connect to server, Error : -602

After the upgrade, servers without a replica start reporting 'ERROR: Could not connect to server, Error : -602' for all other servers in ndsrepair -T after some time (10-120 minutes).

After a network repair everything is OK again for some time.

With OES 2018 SP3 we didn't have this issue.

Any ideas? Has something changed in the replica distribution design? (We hold replicas only on 3 main servers.)

Best regards

Martin

  • On something like

    slptool -u xx.xx.xx.xx findsrvs service:bindery.novell

    with xx.xx.xx.xx being the IP of a configured DA, do you get ALL NCP servers listed?

  • Checked the DAs. The server is visible on both DA servers, and slptool delivers the list of all NCP servers.
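The check discussed above (does the DA list ALL NCP servers?) can be automated roughly as follows. This is only a sketch: the function name `missing_servers` and the file of expected server names are assumptions made for illustration, and the match is a plain case-insensitive substring search, so a name like FS1 would also match FS10.

```shell
#!/bin/sh
# Sketch: print every expected NCP server name that does not appear in an
# SLP listing. The listing is read from stdin; the expected names (one per
# line) come from the file passed as $1. Both the function name and the
# file are hypothetical helpers, not part of slptool.
missing_servers() {
    expected_file=$1
    listing=$(cat)                      # capture the SLP listing from stdin
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue      # skip blank lines
        # Case-insensitive fixed-string match anywhere in the listing.
        # Note: substring match, so "FS1" is also satisfied by "FS10".
        if ! printf '%s\n' "$listing" | grep -qiF -- "$name"; then
            printf '%s\n' "$name"
        fi
    done < "$expected_file"
}
```

Usage, assuming a hand-maintained file expected-servers.txt: `slptool -u xx.xx.xx.xx findsrvs service:bindery.novell | missing_servers expected-servers.txt`. Empty output means every expected name was found in the DA's answer.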

  • The strange thing is that you weren't facing this before. IIRC, "ndsrepair" references the "status" attribute of the remote box in the first place. This attribute is not populated on XREF (nor is it one of the "cached-on-xref" ones), hence the "602" (no such value). All in all (as name resolution as such works) I don't see a reason to worry.

    But there has been a thread (at least one) around referencing exactly this behaviour (602 on xref servers after upgrading to OES2023).

  • I'm seeing this as well. 
    On a system with mostly OES 2018.3, a few OES 2023, and one lone OES 2015 pending removal, I see exactly this on the one (so far) 2023 box without a replica.
    On the suspicion that I might not have the new firewall fully figured out, I dropped it for a while, but no change.

    Both of the following match:
       slptool -u xx.xx.xx.xx findsrvs service:bindery.novell |sort
       slptool findsrvs service:bindery.novell |sort

    ndsrepair -N worked very fast to find all the addresses, and now there are no errors on ndsrepair -T, at least for now. I will be monitoring.

    eDir is otherwise nice and healthy, and has been kept that way most of its life. It started at NW 4.11; I started working with them leading into the NW 5 migration. The CA is still SHA1 signed, but that rebuild is part of this Xmas break's tasks. I mention all this in case something lingering in older trees might be a source.

    Edit: the -602 error for all the servers came back on that box in under an hour, even with the firewall turned off.

    ________________________

    Andy of KonecnyConsulting.ca in Toronto
    Please use the "Like" and/or "Verified Answers" as appropriate as that helps us all.

  • I opened a support case back in March of this year with the eDirectory team. I originally saw this on our OES2023 cluster servers without a replica. Support stated that this is more of a cosmetic issue and does not affect the operation of eDir/OES. Currently I see this issue on all of our 2023 servers without replicas (20+). Although ndsrepair -N does resolve the issue, it does so only for a short period. We have not seen any problems with our eDirectory in the current state; all operations can be done without issues.

    HTH
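To put numbers on how quickly the error returns after an ndsrepair -N, the ndsrepair -T output can be sampled periodically. The following is a minimal sketch, not a supported tool: the function names and the log path are hypothetical, and the pattern assumes the error text appears as quoted in this thread.

```shell
#!/bin/sh
# Sketch: count the -602 connection errors in "ndsrepair -T" output (stdin).
# The pattern is taken from the message quoted in this thread and is an
# assumption about the exact wording on other versions.
count_602() {
    # grep -c exits non-zero on zero matches but still prints 0.
    grep -cE 'Error *: *-602' || true
}

# Append one timestamped sample to a log file. Meant to be run periodically
# (for example from cron). The default log path is a hypothetical choice;
# ndsrepair normally has to be run as root.
log_602_sample() {
    logfile=${1:-/tmp/ndsrepair-602.log}
    n=$(ndsrepair -T 2>&1 | count_602)
    printf '%s errors_602=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$n" >> "$logfile"
}
```

Calling `log_602_sample` every few minutes right after an ndsrepair -N gives a simple timeline showing when the count goes from 0 back to the number of remote servers, which may be useful evidence for a support case.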

  • Unfortunately, that KB does NOT address the change in behaviour since it was last updated.

    While it is a reasonable description for the occasional connection error, since OES 2023 it appears that EVERY instance hits it in under an hour, from what we are seeing. That is a large delta from before, and it should be looked at in case there is an underlying problem that is fermenting and may hit us hard later.

    Oh, and the OES monthly webinar is set for tomorrow, where this will be a good topic. Let's see who brings it up first ;)

    ________________________

    Andy of KonecnyConsulting.ca in Toronto

  • The CA upgrade, including creating a new tree key, is also on my list of tasks for the coming months. If you have a DSfW server in your tree, I would be very interested in your results and how you did it. I tried this some time ago on OES 2018.3 and failed terribly: Kerberos crashed after the upgrade, although the documentation said that the Kerberos version in OES 2018.3 can handle those keys and certificates. I had to restore all replica servers from backup to their state before the CA and key upgrade to get Kerberos authentication working again. Opening a support ticket was not an option, because it would have meant that all resources requiring AD (DSfW) authentication were inaccessible to everyone until support had solved the problem.

  • Ouch.  Was that the Kerberos bits on the OES box or the AD side?

    One of the trees I'm doing this Xmas break doesn't even have AD, so no issue there. The other one already in the queue does have an old OES 11 box called dsfw that was part of a previous attempt to connect. You now have me looking at it much more closely to make sure it isn't an issue, since at this client every bit of work uncovers new 'fun'.

    ________________________

    Andy of KonecnyConsulting.ca in Toronto

  • It was the Kerberos on the OES (DSfW) box.