NITD errors even though NSS is NOT AD-enabled

I found a huge number of errors (25,000+) like this one:

Apr 02 00:31:40 server1 smdrd[5729]: [NIT_IPC  0x7f11d9fff700]: ERROR: nitlib_get_userinfo_from_guid: Error response from nitd for getting the userinfo with GUID: 771288b5-0305-44b7-92-8e-b58812770503, error: -9001

TID https://support.microfocus.com/kb/doc.php?id=7024266 shows that this is a long-known problem, or it is back with OES 24.1.1.

systemctl status novell-nit.service shows that the service is running. Why?

David

  • I have been seeing those errors for quite a while, especially for service users belonging to services that are not OES or SLES services (e.g. server monitoring software). I think it is novell-cifs related in my case, as all NSS volumes are also available via CIFS in my environment, but I did not dig further.

  • A very frustrating 'Crying Wolf' set of errors.  It would be nice to get that defect fixed.

    I see this in environments that have never had AD in there at all, never mind all the ones who just haven't found a real need for adding that security risk.

    I figure this is an Idea worthy thing to prod dev to fix, so if you agree, you can vote for NITD errors('Crying Wolf') even without any AD involved

    ________________________

    Andy of KonecnyConsulting.ca in Toronto
    Please use the "Like" and/or "Verified Answers" as appropriate as that helps us all.

  • I think this is a more severe issue. By now, many core services of OES rely on LUM and therefore on NIT. These include smdr (tsafs), sfcb (used to manage NSS storage) and httpstkd. So I think almost all OES servers with a recent version of the OS have LUM and NIT running.

    The bad thing is that almost nobody pays attention to the content of the Unix Workstation object, and especially to its LUM-enabled groups. I am not sure what the default settings are, if there are any. Only on a DSfW server are all users and groups LUM-enabled (unless you maintain a part of eDirectory that is not part of the domain), so you see almost no nitd errors there. But it would be really interesting to know which groups should be among the LUM-enabled groups. E.g.: on one server I found lots of NITD errors during backups, asking for groups that are not LUM-enabled on that server but have rights to the filesystem. Why has Novell/Microfocus/OT moved such vital services to rely on LUM?

    And to make things worse, the nam.conf file is not working as it should and is reduced in functionality compared to OES 2015 and before.

    In the old days you could specify port numbers for the alternative-ldap-server-list; now those entries kill namconfig -k. You can still set them via "namconfig set", but if you run namconfig -k afterwards, namconfig dumps core. This is especially bad in environments with DSfW, as you can use either only DSfW servers or only non-DSfW servers as LDAP servers but cannot mix them, because of the different port numbers.

    I fear that the resources to get things straight again will not be available, so OES is drifting toward being more unmaintainable from update to update.

  • it would be really interesting to know which groups should be among the LUM-enabled groups. E.g.: on one server I found lots of NITD errors during backups, asking for groups that are not LUM-enabled on that server but have rights to the filesystem.

    It is as if there is an assumption that all file access groups will be LUM enabled, even in the classic baseline NCP file sharing system that is the root of OES since it was NetWare. Such a prejudgemental 'logic' so capitalizes the first three letters of assumption.

    ________________________

    Andy of KonecnyConsulting.ca in Toronto

  • Hi All,

    This has nothing to do with AD.

    Are these errors observed while restoring a backup on OES servers that hold no replica?

    Lokesh

  • These errors are observed during normal backups. As tsands and tsafs are both running and the backup jobs run in parallel, I cannot tell whether they come from the tsafs or the tsands daemon. But all servers have replicas.

  • And the clients where I've looked for this all use Veeam-type snapshots, so the TSAs are not touching them. I am not seeing the errors on the servers with NSS volumes that users don't touch (GroupWise). On a lark, I added a group right to one of those, and got

    2024-04-04T16:56:29.645782-04:00 ws1 nitd[2362]: [NIT_THRD 0x7fa765bfe700]: ERROR: getlocaluserinfo: No local PWD entry found for the user: Everyone

    but also got that for a group I thought I had LUM-enabled. But then this box doesn't have any replicas; let's see if the 602 errors are still there on this 24.1.1 box later, after having run ndsrepair -N on it.

    Now to see whether there are other errors later on, now that there are groups.

    ________________________

    Andy of KonecnyConsulting.ca in Toronto

  • The error "ERROR: getlocaluserinfo: No local PWD entry found for the user: Everyone" should be seen only in OES 24.1 and later, not before that.

    GUID lookup errors could also have been there earlier. They should be thrown only if the local eDirectory doesn't have the object with the GUID passed as input. That is the reason for asking about local replicas. If the local eDirectory holds replicas for all the users, this error shouldn't be seen unless the object is deleted.

    For the errors being thrown for GUIDs, can you please try "nitconfig getuserinfo fromguid <GUID>"?
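    To get the list of GUIDs worth trying with that command, the failing GUIDs can be pulled out of the log first. The following is only a minimal sketch, assuming the smdrd line format quoted at the top of this thread; the function name failed_guids and the sample list are made up for illustration, and the GUID is matched loosely because the logged form carries an extra hyphen.

```python
import re
from collections import Counter

# Pattern derived from the smdrd line quoted in this thread (an assumption
# about the general format). The GUID is matched as "hex digits and hyphens"
# rather than a strict 8-4-4-4-12 layout, since the logged form differs.
GUID_ERROR = re.compile(
    r"nitlib_get_userinfo_from_guid: .*?GUID: (?P<guid>[0-9A-Fa-f-]+), "
    r"error: (?P<code>-?\d+)"
)


def failed_guids(lines):
    """Count (GUID, error code) pairs over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = GUID_ERROR.search(line)
        if match:
            counts[(match.group("guid"), int(match.group("code")))] += 1
    return counts


# Sample input: the line from the first post, twice, plus an unrelated line.
sample = [
    "Apr 02 00:31:40 server1 smdrd[5729]: [NIT_IPC  0x7f11d9fff700]: ERROR: "
    "nitlib_get_userinfo_from_guid: Error response from nitd for getting the "
    "userinfo with GUID: 771288b5-0305-44b7-92-8e-b58812770503, error: -9001",
    "Apr 02 00:31:41 server1 smdrd[5729]: [NIT_IPC  0x7f11d9fff700]: ERROR: "
    "nitlib_get_userinfo_from_guid: Error response from nitd for getting the "
    "userinfo with GUID: 771288b5-0305-44b7-92-8e-b58812770503, error: -9001",
    "Apr 02 00:31:42 server1 systemd[1]: Started some unrelated unit.",
]

for (guid, code), hits in failed_guids(sample).most_common():
    print(f"{hits:6d}  {guid}  error {code}")
```

    With 25,000+ lines this usually collapses to a short list of distinct GUIDs, each of which can then be checked by hand.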

    Will get back to you on a better way to deal with this bug.

    Lokesh

  • I can tell you which errors you see during a backup with tsafs:

    nitd[2243]: [NIT_THRD 0x7f3f017fe700]: ERROR: getlocaluserinfo: No local PWD entry found for the user: XXX

    and

    nitd[2243]: [NIT_THRD 0x7f3eeb7fe700]: ERROR: getlocaluserinfo: CN not present in the input FQDN: .O=YYY.T=TREE.

    You see the first type of error for all users and groups that have filesystem rights on the volume/filesystem being backed up and that are either not LUM-enabled or, if LUM-enabled, have an alias in another context.

    You see the second type of error for all OUs, Os, etc. that have filesystem rights to the volumes being backed up. As you cannot LUM-enable OUs, every filesystem right granted via a container object produces those errors.

    And occasionally you also see the following errors:

    smdrd[2140]: [NIT_IPC 0x7f10397fa700]: ERROR: nitlib_get_userinfo_from_guid: Error response from nitd for getting the userinfo with GUID: xxxxxxxxxx, error: -9001

    of course during backups.

    And the errors that really flood my messages files are:

    nitd[2243]: [NIT_THRD 0x7f3f01fff700]: ERROR: edir_get_userinfo_from_uid_handler: Failed to fetch FQDN from LUM for UID 480
    nitd[2243]: [NIT_THRD 0x7f3f01fff700]: ERROR: getnamuserfromuid: Error while getting LUM user FQDN for local user: zabbix with UID: 480
    nitd[2243]: [NIT_THRD 0x7f3f01fff700]: ERROR: edir_get_userinfo_from_uid_handler: Failed to get FQDN for local user from /etc/passwd: 2
    nitd[2243]: [NIT_THRD 0x7f3f01fff700]: ERROR: edir_get_userinfo_from_uid_handler: Could not find either local or eDirectory user with UID: 480
    kernel: [1470689.924361][ T1806] nsscomn: Couldn't get user details from NIT for uid = 480, error = -9008

    The user "zabbix", which has UID 480 on this server, is a local user for the zabbix daemon, which is used to monitor the server. It is of course included in the /etc/passwd file with the following entry: zabbix:x:480:474:Zabbix Monitoring System:/var/lib/zabbix:/usr/sbin/nologin
    which shows nothing problematic.

    Obviously the daemon tries to get data from the NSS volumes, which fails, because no data are shown for them. But that is not the problem; the problem is that those errors fill the logs. I could run the agent as root, but as this is not necessary for the function of the agent, I'd like to avoid that.
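
    To see which trustees and local UIDs are behind the flood, the nitd lines can be grouped by kind and subject. This is a rough sketch, assuming only the message texts quoted in this post; the labels (not_lum_enabled_or_alias, container_trustee, unknown_uid) are my own names, not anything nitd reports, and summarize is a hypothetical helper.

```python
import re
from collections import Counter

# (label, pattern) pairs built from the nitd messages quoted in this thread.
# The labels are illustrative names chosen here, not official error classes.
PATTERNS = [
    ("not_lum_enabled_or_alias",
     re.compile(r"getlocaluserinfo: No local PWD entry found for the user: (.+)$")),
    ("container_trustee",
     re.compile(r"getlocaluserinfo: CN not present in the input FQDN: (.+)$")),
    ("unknown_uid",
     re.compile(r"Could not find either local or eDirectory user with UID: (\d+)")),
]


def summarize(lines):
    """Count log lines per (label, subject); lines matching nothing are skipped."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                counts[(label, match.group(1).strip())] += 1
                break  # one label per line is enough
    return counts


# Sample input taken from the lines quoted above.
sample = [
    "nitd[2243]: [NIT_THRD 0x7f3f017fe700]: ERROR: getlocaluserinfo: "
    "No local PWD entry found for the user: XXX",
    "nitd[2243]: [NIT_THRD 0x7f3eeb7fe700]: ERROR: getlocaluserinfo: "
    "CN not present in the input FQDN: .O=YYY.T=TREE.",
    "nitd[2243]: [NIT_THRD 0x7f3f01fff700]: ERROR: edir_get_userinfo_from_uid_handler: "
    "Could not find either local or eDirectory user with UID: 480",
    "nitd[2243]: [NIT_THRD 0x7f3f01fff700]: ERROR: edir_get_userinfo_from_uid_handler: "
    "Could not find either local or eDirectory user with UID: 480",
]

for (label, subject), hits in summarize(sample).most_common():
    print(f"{hits:6d}  {label:26s} {subject}")
```

    Only the final "Could not find" line of each UID burst is counted, so one failed lookup is not counted several times.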