G_ntherSchwarz Absent Member.

NCS pool stuck in loading state without a node assigned

After some mishap involving an NCS node failing to load an NSS pool,
followed by a hard power-off, I am now stuck with one pool in state
"loading" without a node assignment. A simple "cluster offline"
does not work.
Some of the usual tricks, such as migrating the master node or forcing
an update of the locally stored configuration from the one in eDirectory
by changing some options in iManager, have not helped so far.
So I am looking for a way to rectify the situation without an actual
"cluster restart", which usually helps as a last resort but also causes
downtime for all the other pools.

Günther

Knowledge Partner

Re: NCS pool stuck in loading state without a node assigned

In article <KbwRC.1498$RN1.317@novprvlin0914.provo.novell.com>, Günther
Schwarz wrote:
> After some mishap involving an NCS node failing to load an NSS pool
> followed by a hard power switch off I am stuck now with one pool in
> state "loading" without a node assignment. A simple "cluster offline"
> will not work.


Hi Günther
So that we can better understand, which version are we running here and
how many nodes?
I would try to manually mount that pool to see if that makes a
difference, and to possibly get some better error reporting.
Check the /var/opt/novell/log/ncpserv.log and ncp2nss.log as well as
/var/log/messages for anything of note when you attempt to mount this
pool.
Make sure your SAN isn't showing any errors; if it is, fix those first.
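
As a rough sketch of that log check, the helper below greps a list of log files for a pool name and prints the matching lines with file name and line number. The function name scan_pool_logs and the pool name POOL1 are made up for illustration, and the example call assumes ncp2nss.log sits in the same directory as ncpserv.log.

```shell
#!/bin/sh
# Search a set of log files for lines mentioning a pool name.
# Usage: scan_pool_logs POOLNAME FILE...
# (scan_pool_logs is a hypothetical helper, not part of OES.)
scan_pool_logs() {
    pool="$1"
    shift
    for log in "$@"; do
        # Skip logs that do not exist or are not readable on this node.
        [ -r "$log" ] || continue
        # -i: ignore case, -H: always print the file name, -n: print line numbers
        grep -iHn -- "$pool" "$log"
    done
    # A log without matches is not an error for this purpose.
    return 0
}

# Example call (POOL1 is a placeholder pool name; the ncp2nss.log path is assumed):
# scan_pool_logs POOL1 /var/opt/novell/log/ncpserv.log \
#     /var/opt/novell/log/ncp2nss.log /var/log/messages
```

Run it right after a mount attempt so the interesting lines are near the end of the output.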

You might have to go as far as the deeper verify steps as per
https://www.novell.com/support/kb/doc.php?id=7006457
Note that the verify can take a while, and the rebuild certainly takes a
long time; it is like watching grass grow once you get past 99%
done.


Andy of
http://KonecnyConsulting.ca in Toronto
Knowledge Partner
http://forums.novell.com/member.php/75037-konecnya
If you find a post helpful and are logged in the Web interface, please
show your appreciation by clicking on the star below. Thanks!

G_ntherSchwarz Absent Member.

Re: NCS pool stuck in loading state without a node assigned

On 06/05/2018 09:28 PM, Andy Konecny wrote:
> In article <KbwRC.1498$RN1.317@novprvlin0914.provo.novell.com>, Günther
> Schwarz wrote:
>> After some mishap involving an NCS node failing to load an NSS pool
>> followed by a hard power switch off I am stuck now with one pool in
>> state "loading" without a node assignment. A simple "cluster offline"
>> will not work.


> So that we can better understand, which version are we running here and
> how many nodes?


These are four nodes running OES2015 SP1.

> I would try to manually mount that pool to see if that makes a
> difference, and to possibly get some better error reporting.
> Check the /var/opt/novell/log/ncpserv.log and ncp2nss.log as well as
> /var/log/messages for anything of note when you attempt to mount this
> pool.
> Make sure your SAN isn't showing any errors, if so fix them first.


Actually the NSS part looks just fine. I can even run the cluster load
script on a command line. The pool and volume will come online and the
secondary IP address is configured.

> You might have to go as far as the deeper verify steps as per
> https://www.novell.com/support/kb/doc.php?id=7006457
> Note that the verify can take a while, the rebuild certainly takes a
> long time with it being like watching grass grow once you get past 99%
> done.


It is a small pool, so a verify command will not take a long time. But
my problem does not seem to be related to NSS. To me this looks
like NCS itself being stuck. So I am looking for a way to reset this
single resource without doing a cluster restart. I might just delete it
and create it again.

Günther


Knowledge Partner

Re: NCS pool stuck in loading state without a node assigned

In article <D%LRC.1522$RN1.526@novprvlin0914.provo.novell.com>, Günther
Schwarz wrote:
> Actually the NSS part looks just fine. I can even run the cluster load
> script on a command line. The pool and volume will come online and the
> secondary IP address is configured.

...
> It is a small pool, so a verify command will not take a long time. But
> then my problem does not seem to be related to NSS. For me this looks
> just like NCS being stuck. So I am looking for a way to reset this
> single resource without doing a cluster restart. I might just delete it
> and create it again.


Ah, so at least you aren't down and out with that resource.
Perhaps there is an eDir sync issue. Run through the basic checks on all
those cluster nodes and any other servers that hold those objects, the usual
ndsrepair -T
ndsrepair -E
ndsrepair -C -Ad -A
making sure there are no errors and that nothing simple has snuck by you.
Make some other change(s) to the cluster resource object to force some
syncing of it. Perhaps also look at the object from different iManager
instances.
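
If you want to run those three ndsrepair checks across all nodes in one go, here is a minimal sketch. The function name check_edir, the RUNNER variable, and the node names are made up for illustration. It assumes you can ssh to each node as a user allowed to run ndsrepair, and that ndsrepair returns a non-zero exit status on failure, which I have not verified, so read the output either way.

```shell
#!/bin/sh
# Run the three ndsrepair checks on each node, stopping at the first failure.
# Usage: check_edir NODE...
# RUNNER is the command used to reach a node; it defaults to ssh.
# (check_edir and RUNNER are hypothetical names, not part of OES or eDirectory.)
check_edir() {
    for node in "$@"; do
        for opts in "-T" "-E" "-C -Ad -A"; do
            echo "== $node: ndsrepair $opts"
            # $opts is left unquoted on purpose so "-C -Ad -A" splits
            # into three separate arguments.
            ${RUNNER:-ssh} "$node" ndsrepair $opts || return 1
        done
    done
}

# Example call (node1..node4 are placeholder host names):
# check_edir node1 node2 node3 node4
```

Include any other servers that hold replicas of the cluster objects in the node list, not just the cluster nodes themselves.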


Andy of
http://KonecnyConsulting.ca in Toronto
Knowledge Partner
http://forums.novell.com/member.php/75037-konecnya
If you find a post helpful and are logged in the Web interface, please show
your appreciation by clicking on the star below. Thanks!

G_ntherSchwarz Absent Member.

Re: NCS pool stuck in loading state without a node assigned

On 06/07/2018 05:03 AM, Andy Konecny wrote:

> Ah, so at least you aren't down and out with that resource.
> Perhaps there is an eDir sync issue, run through the basic check on all
> those cluster nodes and any others that hold those objects, the usual
> ndsrepair -T
> ndsrepair -E
> ndsrepair -C -Ad -A
> making sure no errors, that nothing simple has snuck by you.
> Make some other change(s) to the cluster resource object to force some
> syncing of it. Perhaps looking at the object from different iManager
> instances.


Thank you very much for your suggestions. In the end, a cluster restart
fixed the issue. The pool is online again and I can migrate it from one
node to another without problems. What I am still missing is a reset
command for single cluster resources.

Günther

