How to delete a duplicate NCP server object?

Hi all,

Somehow the NCP server object of my master eDir server has been duplicated: one object, "myserver", is correct, and the other, named "0_6", is a corrupt object.

The "0_6" object points to exactly the same IP address as the correct "myserver" object; the only difference is that it shows a lower DS version.

So the big question is: how can I remove the corrupt NCP object when I have to shut down the eDir server (with iManager running on it) to do that?

  • 0  

    Deleting collision objects is a poor repair approach. From field experience, I would guess that there may be a duplicated tree with two identical masters, whose replicas are reported as active or down, and that NDS sync is interrupted. Be careful.

    The collision objects can be examined with iMonitor; the eDirectory events (event IDs) can be used to determine the status of the object.

    There are inconsistencies in NDS, which is why collision objects occur. It takes a lot of experience to solve this, and ndstrace is very helpful here. If you find references to -672 or -626, defective transitive vectors, or even timestamps in the future in the trace, repairs are possible. A call to the NDS backline with a description of the damage is recommended.
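
    As a rough illustration of looking for those codes, here is a minimal sketch that lists the lines of a saved trace log mentioning -672 or -626. The function name scan_trace is made up for this example, and the layout of the log lines is an assumption, so treat the matches as a starting point for reading the trace, not as a diagnosis.

```shell
#!/bin/sh
# scan_trace is a hypothetical helper name, not part of eDirectory.
# It prints every line of the given log file that mentions -626 or -672,
# prefixed with its line number.
scan_trace() {
    log=$1
    # -n: show line numbers, -E: extended regex, -e: pattern may start with '-'
    # ([^0-9]|$) keeps longer numbers such as -6261 from matching
    grep -nE -e '-(626|672)([^0-9]|$)' "$log"
}
```

    Call it as scan_trace followed by the path of the trace log you saved. It returns non-zero when nothing matches.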

    “You can't teach a person anything, you can only help them to discover it within themselves.” Galileo Galilei

  • 0  

    Those are always signs of a problem, so start with the basic eDir health check to make sure there is nothing obvious like a communications problem or stuck obituaries.

    Make sure those are all dealt with before tackling this directly, solving one problem before you move on to the next. On each eDir server, run the following (assuming Linux; any Windows boxes with eDir on them also need their matching checks run):

     # ndsrepair -T
     # ndsrepair -E
     # ndsrepair -C -Ad -A

    Only once you've cleared all the issues and those checks run clean is it really worth the deeper dive George describes for the next stage. Only once those are all totally clear could we delete the collision object. If you get stuck clearing any of the issues you find, open a support case with that corrupt object as the stated reason, so that support works all the way through and isn't tempted to close the case once an intermediate issue is fixed.
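
    If you want to keep the output of those three checks for comparison between servers (or to attach to a case), a small wrapper along these lines might help. It is only a sketch: run_health_checks, DRY_RUN and the output file names are made up for this example and are not ndsrepair features, and it assumes ndsrepair is on the PATH and that you run it as root.

```shell
#!/bin/sh
# run_health_checks is a hypothetical helper; only the three ndsrepair
# invocations themselves come from the post above.
# Usage: run_health_checks [output_dir]
# Set DRY_RUN to any non-empty value to print the commands without running them.
run_health_checks() {
    out_dir=${1:-.}
    for opts in "-T" "-E" "-C -Ad -A"; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "ndsrepair $opts"
            continue
        fi
        # Build a file name such as ndsrepair-T.log or ndsrepair-C-Ad-A.log
        out_file="$out_dir/ndsrepair$(printf '%s' "$opts" | tr -d ' ').log"
        # $opts is left unquoted on purpose so "-C -Ad -A" becomes three arguments
        ndsrepair $opts > "$out_file" 2>&1
        echo "ndsrepair $opts: output saved to $out_file"
    done
}
```

    The wrapper does not judge the results; you still have to read each saved file and confirm the checks came back clean.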

    ________________________

    Andy of KonecnyConsulting.ca in Toronto
    Please use the "Like" and/or "Verified Answers" as appropriate as that helps us all.

  • 0 in reply to   

    I ran all the ndsrepair commands and received no errors. After that, I shut down the master eDir server and deleted the 0_6 object from my second iManager server. That worked without problems, but I now have other problems from my failed OES 23.4 upgrade.

    All my servers are virtual; before upgrading I made a backup of all servers.

    I accidentally deleted the backup of my master eDir server from before the upgrade attempt (so what I have here is an eDir modified by the 23.4 upgrade attempts on all 4 other servers).

    First "problem": DNSDHCP console

    When I log in to the DNSDHCP console through the master eDir server, my HO master DSfW server, which is also the DNS and DHCP server, does not show in the status bar, and as you can see, in the entry designated as primary server the name is written as an eDir name.

    And now comes the fun part: when I log in to the DNSDHCP console through my BO eDir server, which holds only a replica of the master eDir, I see it totally correct!

    I just figured out that if I delete the DNS server (which shows incorrectly on the master eDir server) and recreate it, it is shown correctly again on the master eDir server, but this is very weird.

    Now we come to the second fun part: if I want to log in to my HO DSfW server through the console, I get this nice error:

    So login is not possible, and after this error message the ndsd service crashed on the DSfW server.

    Clearly I broke something in my master eDir server, or have wrong entries because of the upgrade, and I could kick myself for losing the "before upgrade" backup file.

    Now I am thinking about retrying the upgrade of my servers from OES 2018 SP3 to 23.4 and hoping that these problems will be fixed.

    Another plan is to copy the eDir database from my second eDir server (it is untouched, in its state from before the upgrade of all my servers) to my master eDir server and see what happens, but I don't know the best way to do this.

    Backup and restore with the ndsbackup utility?

    Make my second eDir server the master replica, roll back all server backups, and let the old master eDir server sync?

  • 0   in reply to 


    There is a whole bag of problems in these systems. The first thing to do is to take pen and paper and record and describe each problem individually. Each problem should be isolated and considered on its own. For example, problem C can be a consequence of problem A, and without fixing A, the error in C will remain. These notes can then be used to identify the first problem in the chain.

    The analysis of a supportconfig is very helpful and can provide an opportunity to create a troubleshooting plan using an ABC analysis and ranking.

    It is difficult to say anything from a distance, but this is how I would proceed:

    1. NDS is damaged and the replica rings are inconsistent; this is the primary flaw. Clean up NDS first.

    2. The upgrade from 2018 SP3 leads to errors, which can be found in the supportconfig. There are some notes on the upgrade errors in the forum. From support cases with customers I can say that during migration certificates often become defective, symlinks go missing, and services have to be migrated. Common proxy user passwords are missing or have not actually been migrated, and much more.

    3. Worst case: roll back to OES 2018. I don't know the installation on site, but in worst-case DR procedures I often save trustees and other important information beforehand. Then I remove the root partition with the OES installer under a system and import a backup. The process is quite complex because the directory service returns to a state that is not well defined. I ask for your indulgence if I do not describe exactly how I proceed here. It is a ride on a knife edge and can lead to total loss if done without the right knowledge.

    George


  • 0 in reply to   

    I know it's a ride on a knife edge at the moment. I just opened a case; maybe support can help me.

    Briefly described, the real main problem is:

    I rolled back only 3 of 4 VM disks (all except the VM disk of the master eDir server) after a failed upgrade to OES 23.4.

    So 3 servers are in their original state and the eDir master runs in a modified state (it thinks that the other 3 servers are on OES 23.4, but they are still on OES 2018).

    A bad situation, even if the clients don't notice my server problems and the main functions work without issues.

  • 0   in reply to 

    First "problem": DNSDHCP console

    When I log in to the DNSDHCP console through the master eDir server, my HO master DSfW server, which is also the DNS and DHCP server, does not show in the status bar, and as you can see, in the entry designated as primary server the name is written as an eDir name.

    And now comes the fun part: when I log in to the DNSDHCP console through my BO eDir server, which holds only a replica of the master eDir, I see it totally correct!

    The DNSDHCP console does not update your DNS or DHCP server. It only updates eDirectory!

    If it shows different information when pointed to different servers, it is an indication that you have eDirectory replication issues.

    __________
    Kevin Boyle, 
    Knowledge Partner

    Calgary, Alberta, Canada

  • 0 in reply to   

    Yeah, the big question is whether it will work correctly when I copy the eDir database from the "good" server to the faulty server.

  • 0 in reply to 

    That is never a good idea.

    If this server is not a DSfW server, I would remove all replicas from it. Afterwards I would try to get the rest of the replica ring into a good working state. And if the other servers are working without any problems, I would start to re-add replicas to this server.

    BUT - if this is a DSfW server you are probably in big trouble, as there are several values which are not replicated at all on DSfW servers. To repair this with DSfW you absolutely need the help of OT support, from a senior DSfW specialist, because the values which are not replicated on DSfW servers are not documented anywhere, at least not in the documentation accessible to mere OT customers. As you have no reliable backup, it may well be that your best solution is to remove and permanently shut down this problem server, build a new DSfW server with a new name to replace it, and mount the volumes of the problem server on the replacement. And in this case I would not migrate any services but instead recreate them manually as needed.

  • 0 in reply to 

    No, fortunately this is not a DSfW server. For my two DSfW servers I have an untouched previous state (from before the OES 23.4 upgrade), and also for the other eDir replica server (which must hold a 1:1 copy of my master replica).

    The affected server runs "only" NSS and GroupWise, and it holds the master replica.

    My plan to solve maybe all the troubles I have right now is:

    1st stage: bring back all servers except the faulty one from their previous state, then designate the eDir replica as the master replica. After all these steps, shut down these servers or cut the network link.

    2nd stage: bring the faulty eDir server back up, designate it as a read replica, and then boot up all servers again.

    I wonder if this can work...

  • 0   in reply to 

    No.

    Does the following describe the situation properly?

    - you have a garbled tree consisting of 5 servers

    - you can revert 4 of these 5 (all but the master of root) to a point when things were working

    - at least one of these 4 holds a R/W replica of all partitions which currently have the master located on the broken one

    - the broken one does eDir, NSS and GW only (which box holds the CA btw.?)