Super Contributor.

OES 2018 - 64-bit pool/NSS volume cluster resource goes 'Unassigned' when brought online



OES 2018 - the 64-bit pool/NSS volume goes 'Unassigned' when bringing the cluster resource online
- this is a brand new 64-bit pool/NSS volume
- the other 3 pool/NSS volumes are 32-bit and mount fine via the cluster
- the cluster resource does have the 2 cluster servers assigned to it

Bringing the pool online with a manual load script works.

Bringing the cluster resource online shows this in messages:

2020-06-29T17:18:03.146676-06:00 nss-file02 kernel: [355017.314125] POOL_REVISION_CHECK (NCS_CRM_RES_T03:0:13): Cannot online resource 'DP5' on node 'nss-file01', because NSS on the node may not understand the newer NSS media associated with the resource.

---
Cluster load script.
--
#!/bin/bash
# DP5 Load script

. /opt/novell/ncs/lib/ncsfuncs
exit_on_error nss /poolact=DP5
exit_on_error ncpcon mount DATA5=245
exit_on_error add_secondary_ipaddress 192.168.12.215
exit_on_error ncpcon bind --ncpservername=DP5 --ipaddress=192.168.12.215
exit_on_error novcifs --add '--vserver=".cn=DP5.ou=Novell.dc=nss.dc=local.t=NSS_TREE."' --ip-addr=192.168.12.215
exit 0

--
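For reference, the matching unload script for this resource follows the usual OES pattern (just a sketch; ignore_error and del_secondary_ipaddress come from ncsfuncs, and the names/IP are taken from the load script above):
--
#!/bin/bash
# DP5 Unload script (sketch)

. /opt/novell/ncs/lib/ncsfuncs
ignore_error novcifs --remove '--vserver=".cn=DP5.ou=Novell.dc=nss.dc=local.t=NSS_TREE."' --ip-addr=192.168.12.215
ignore_error ncpcon unbind --ncpservername=DP5 --ipaddress=192.168.12.215
ignore_error del_secondary_ipaddress 192.168.12.215
ignore_error nss /pooldeact=DP5
exit 0
--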
nss-file02:~ # cluster online dp5

Status for Resource: DP5
Unassigned Lives: 0
Revision: 6
nss-file02:~ # cluster status
Master_IP_Address_Resource Running nss-file01 1
DP2 Running nss-file01 1
DP3 Running nss-file01 1
DP4 Offline 0
DP5 Unassigned 0
---
Manual load script
- this script activates the pool and volume, sets a CIFS name and IP address - all of this brings the pool/NSS volume online
--
nss-file02:~ # cat dp5load.sh
. /opt/novell/ncs/lib/ncsfuncs   # source the NCS helper functions (not strictly needed for this manual run)
nss /poolact=DP5
ncpcon mount DATA5=245
# add_secondary_ipaddress 192.168.12.215
ip addr add 192.168.12.215/24 dev eth0
ncpcon bind --ncpservername=DP5 --ipaddress=192.168.12.215
novcifs --add '--vserver=".cn=DP5.ou=Novell.dc=nss.dc=local.t=NSS_TREE."' --ip-addr=192.168.12.215
--
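After running the manual script, the pool, volume and secondary IP can be sanity-checked with the stock OES tools (a quick sketch, adjust names to your setup):
--
nlvm list pools          # DP5 should show as active
ncpcon volumes           # DATA5 should be listed as mounted
ip addr show dev eth0    # 192.168.12.215 should appear as a secondary address
--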

Following the link
https://www.novell.com/documentation/open-enterprise-server-2018/clus_admin_lx/data/bxxeft8.html
- this describes my issue, but I have none of the problems listed there.

Any suggestions, comments, guesses?

 

Accepted Solutions
Knowledge Partner
Think you've accepted the wrong post as solution 🙂

Anyway, I think (based on the behaviour we've seen) that the code which determines the node's (not the pool's) NSS "ability" does not run at all if there's no SBD partition (I'd guess the reasoning is "SBD is mandatory for clustered file services, a one-node FS cluster doesn't make sense, so there's no need to run it"). Hence it's simply "0" instead of "14" (as it should be on OES2018SP2). When the node's and the pool's capabilities are then compared, the result is "0" vs. "0" for a legacy pool (which is OK and allows activation) and "0" vs. "13" for a "latest & greatest" pool, which blocks activation via NCS methods. If you activate a pool "manually" this NCS check is not made, which is why it worked from your script.
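If you want to verify the SBD part of this theory, a quick check on the node would be something like the following (sketch using the stock sbdutil shipped with NCS; verify the exact options with sbdutil -h on your box):
--
# does this node see an SBD partition for the cluster?
/opt/novell/ncs/bin/sbdutil -f     # prints the SBD device if one is found
/opt/novell/ncs/bin/sbdutil -v     # dumps the SBD contents as a sanity check
--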

 

35 Replies
Knowledge Partner

Is this really OES2018FCS (i.e. no SP)?

Which pool and volume layout do you see in nsscon (this is reported at activation time)?
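In other words, something along these lines (pool name taken from your load script), while watching nsscon or /var/log/messages for the layout lines:
--
nss /poolact=DP5      # activation prints the "Pool layout" / "Volume layout" versions
nss /pooldeact=DP5
--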

 

Super Contributor.
OES version is OES2018SP2
The 64-bit pool/NSS volume was made under OES2018SP1.

As for nsscon:
- for the existing 32-bit pool/NSS volume
Activating pool "DP2"...
** Pool layout v43.03
** Processing journal ** 1 uncommitted transaction(s)

Activating volume "DATA2"...
** Volume layout v38.05
** Volume creation layout v36.03
** Processing volume purge log
Setting Zid mode to zid32 on DATA2

- for the new 64 bit Pool/NSS volume
- Manual mounting with manual script or nssmu
Activating pool "DP4"...
** Pool layout v52.01
** Previous clean shutdown detected (consistency check OK)
** Loading system objects
** Processing volume purge log
** .
** Processing pool purge log
** .
Loading volume "DATA4"
Volume DATA4 set to the DEACTIVE state.
Pool DP4 set to the ACTIVE state.
Activating volume "DATA4"...
** Volume layout v41.01
** Volume creation layout v41.01
** Processing volume purge log
Setting Zid mode to zid32 on DATA4

Does this help?
Knowledge Partner

Can't see anything wrong from the output. The one thing which makes me wonder is that the volume comes up in zid32 mode where (to my knowledge) it should be zid64, provided that all nodes are running OES2018 or later.

Are all nodes at 2018SP2 (not just the ones which have the resource assigned)?

Did this ever work before upgrading to SP2 (i.e. at the time you've created the pool and its volumes)?

 

Super Contributor.
All nodes in this cluster are OES2018SP2, and all nodes in the tree are OES2018SP2.

Mounting the pool/NSS volume never worked with a cluster resource script, but it loads fine via the manual script or nssmu.

The pool/NSS volume was made with OES2018SP1, and at that time nssmu created a 64-bit pool/NSS volume.
The cluster resource was made, but I could never mount it with the cluster resource script.
Knowledge Partner

Are both DP4 and DP5 affected?

Super Contributor.
YES
Knowledge Partner

Was just asking because in the OP you mentioned that "the other 3 Pool/NSS volumes are 32 bit and cluster mount" and there seems to be a total of four pools. Anyway, while in this setup a volume in a 64-bit pool should activate in zid64 mode, that's likely unrelated: you get the error when activating the pool, and that step doesn't need a volume at all.

If both DP4 and DP5 were created on the same codebase, do you have a little diskspace on the shared storage to create a tiny resource with nssmu?
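Something like this would do for the space check before firing up nssmu (just a sketch; nlvm is the stock OES tool, check man nlvm for the exact output columns):
--
nlvm list devices     # look for a shared device with some free space for a small test pool
nssmu                 # then create a tiny clustered pool/volume via the Pools / Volumes menus
--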

 

Super Contributor.
I can delete DP5 Pool/NSS volume & then dd the disk to clear the partition info. This will make a clean disk to test with.

What are you thinking? What process/procedure do you think will create a proper 64 bit Pool/NSS volume?
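I'd wipe it roughly like this (a sketch only - the device name is an example, double-check it before wiping; nlvm init should be the OES way to re-initialize the device afterwards, see man nlvm for the exact syntax):
--
# clear the old partition/pool metadata from the shared disk (example device only!)
dd if=/dev/zero of=/dev/sdX bs=1M count=100
nlvm init sdX         # re-initialize the device so nssmu sees it as clean
--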
Knowledge Partner

Well, we've seen this error in the past now and then, but when it's been a bug it has always been related to pools created a long time ago (on NetWare, e.g.). In those cases the code simply misinterpreted the pool version as "too new for me". But you're facing issues with pretty fresh pools, and you were facing them even on the codebase the pools were created on, which is something I've never seen before.

So maybe something went wrong at creation time (a bug in the libraries or in iManager, if it's been used). Hence, if time permits, I'd try with current code, as I'd rather doubt that such an error would have been carried over for such a long time. An SR would be an option, too, but it might be faster to do the check first. And nssmu would be the tool of choice.

 

 

Super Contributor.
- deleted the old DP5 pool
- initialized the device
- created a new pool DP6 - only used 60G of 600G
- pool creation complained that not all the servers in this cluster are at OES2018SP2
- both servers' /etc/novell-release show
---
Open Enterprise Server 2018 (x86_64)
VERSION = 2018.2
PATCHLEVEL = 2
---
  (these were upgraded from OES2018SP1 early this month)
continued:
- filled out the pool creation info for a cluster resource
- created a new volume DATA6
- deactivated volume and pool
- cluster status showed the new resource
- cluster online dp6_server
- still unassigned
- cluster offline dp6_server

nsscon
--
Activating pool "DP6"...
** Pool layout v52.01
** Previous clean shutdown detected (consistency check OK)
** Loading system objects
** Processing volume purge log
** .
** Processing pool purge log
** .
Pool DP6 set to the ACTIVE state.
Loading volume "DATA6"
Volume DATA6 set to the DEACTIVE state.
Activating volume "DATA6"...
** Volume layout v41.01
** Volume creation layout v41.01
** Processing volume purge log
** .
Setting Zid mode to zid32 on DATA6
Deactivating volume "DATA6"...
Deactivating pool "DP6"...
Pool DP6 set to the DEACTIVE state.
---
Looks like it's still zid32?

What next?




Knowledge Partner

That's really a strange one. At least we've seen that the box already complains at pool creation time (do you remember the exact message, btw?). I have no clue how it comes to this conclusion. I've set up a little lab based on up-to-date OES2018SP2 and tried to duplicate the behaviour by tampering with some text files (novell-release, stuff in /etc/products.d and /etc/sysconfig/novell) and attributes (such as "version", "DS version" and "NCS:Revision"). I've deleted stuff and set it to arbitrary values; nothing kept the cluster from activating and mounting (in zid64 mode, btw).

I've also created a "fake node" which indeed gets listed with the fake node ID I've assigned to it, but stuff keeps on working. Nevertheless it won't harm to check /var/opt/novell/ncs/nodes.xml and gipc.conf, comparing nodes and node IDs with what eDirectory has to offer. Apparently "NCS Revision" on the cluster object should be "282", but it's been used for other stuff in the past anyway (and I've set it to "0" and "9999" and deleted it without consequences). "nss version" states "4.21d Build 6960", and this should also get reflected in the XML files located in "/_admin/Manage_NSS/Module". Years ago I've seen an AV product locking stuff there, which caused all sorts of problems.
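For the record, those checks boil down to something like this (paths as mentioned above; I'm assuming gipc.conf sits next to nodes.xml, and /_admin is the NSS admin volume):
--
cat /var/opt/novell/ncs/nodes.xml      # node names / IDs as NCS sees them
cat /var/opt/novell/ncs/gipc.conf      # compare with the node list in eDirectory
ls -l /_admin/Manage_NSS/Module/       # NSS module/version XML files (watch for AV locks here)
--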

Other than that, I'd by now recommend opening an SR, as the issue shouldn't be too hard to resolve once we know how the code tries to identify the OS. You might want to reference Bug#1172034, which is pretty new and describes a similar issue (though it concerns pools which used to work before). Especially mention the message you got yesterday when creating the pool with nssmu.

 

 
