Tarasus Absent Member.

NetWare 6.5 NSS volume disactivates

Good day to you, venerable Novell community!
You are my last hope for solving an existing and growing problem.
This may take a lot of words, but I have to tell you everything in order, in my broken English (for which I apologize in advance).
The story starts a few years ago, when I needed to replace one of the hard disks in a working NW 6.5 server with a newer and bigger one, because the HDD was starting to fail. I had a SUN disk shelf as a donor and took the first similar HDD from it to replace the doomed patient. I didn't know how to migrate data from one volume to another, and since this was the system disk that the server boots from, I decided to try cloning it, perhaps resizing the existing NSS partition along the way, because the disk being replaced had a capacity of 36 GB and the "new" one had 72 GB. Using Acronis True Image, I tried to clone the entire disk from the smaller to the bigger one manually, but the software said that it could not resize this type of partition, although it could clone the whole disk to the other one byte for byte, leaving the remainder as free space. I agreed, cloned the disk successfully, replaced the old one, and booted from the "new" disk without any issues. After that, using Novell Remote Manager right from the server's graphical console, I expanded the SYS volume and another one named "STORAGE" into the free space on the "new" disk.
This setup worked successfully for about 5-6 years, up to now.
The trouble came with a new security policy in our organization: all workstations now have to be equipped with software that allows working with global databases via encrypted tunnels. That software (nobody knows why, including its authors) writes a file named SMART_IO.CRD to every volume that it recognizes as fast: a flash drive, a network drive, etc. It constantly writes to and reads from these files. Periodically one of the workstations holds this file for its own purposes, and the rest of the connected users have to wait until the workstation that locked the file releases it. While this is happening, everything is very slow for all users: folders take 20-30 seconds to open, accessing files takes 1-1.5 minutes, and opening a regular .doc file becomes an event lasting five minutes or more.
At first we just rebooted the server. But the file locking came back very quickly, after 1-2 hours of work. So the next (more effective) solution was to rename the locked file to something else: the station that holds the file keeps working with it, while the other stations immediately create a "new" SMART_IO.CRD and happily carry on working with that.
Maybe I had found a solution, but doubts haunted me more and more: this server is designed to serve 1000+ users, has an average load of 1.5% (if the "load monitor" screen didn't lie to me), and has SCSI disks (OK, not young, but so what?) with volumes spread more or less evenly across them. Why, WHY does this system choke on locking ONE STUPID file that is used by maybe 100 users?
The answer did not keep me waiting long: after a couple of months of "renaming" like this, I received a new surprise from the server: volume "STORAGE" deactivated due to a hardware error. And on the console I saw a message that one of my disks had a problem and needed to be replaced.
Rebooting the server didn't show anything alarming: the server booted, the volumes mounted, the users worked.
OK, I understood the task: I needed another HDD.
And while I searched for a drive, I came up with a plan for the replacement: do what I did in the past, that is, clone the disk and then expand the volume.
No sooner said than done. I booted from the Acronis boot CD and started cloning... Oops. A sector read error occurred. Retry, retry, retry: no help. OK, I tried to read the data from the volume instead. I booted the server normally, but without X, and tried to read. After reading for 10-15 minutes an error message appeared: the volume was deactivated, and the hard disk was heading for the scrap heap fast. Reboot. Read some files. Error. And so on, many, many times, for two days. The users said that they had retrieved their most critical files and that I could now do something drastic with the server, and I did. I tried to recover the bad sectors on the disk using a SCSI utility. It found many, many errors and reported that it had reassigned all the bad sectors.
"Very good!" I said to myself. "Now it's time to start cloning!"
The cloning needed many retries, but I had to skip only one sector: 100+ retries didn't help me read it. The cloning "ended successfully", and I removed the old disk, selected the "new" one to boot from, waited, and hoped for a miracle. For some reason the miracle did not happen: during boot the server said that it could not mount volume SYS, because something was corrupted or one or more segments were not found.
I tried to repair SYS via the NSS menu. But here is something interesting: all the documentation tells me that when I start nss menu, I should have several options: pools, volumes, RAID, etc., and REBUILD. I have no REBUILD option, whether I type nss menu or nssmnu. In the menus I can't see why the volume is broken: nothing in any submenu relates to SYS, only "unused space" or space belonging to volume STORAGE.
I booted from the sick disk again. It booted and mounted all the volumes. I read more data, but the problem, of course, was still there.
I didn't find any information on how to attach or mount an existing partition (nss menu showed me that "here is some unknown partition that does not belong to anyone but has the same size as your lost SYS volume"), so I decided to create the SYS partition on the "new" disk. But I didn't create a clean new partition. I wanted to show my server that "here is some free space where you can place the missing part of the SYS volume", so I just expanded the SYS volume into some unused space on my freshly cloned disk. I know, it was a big mistake.
The pool for volume SYS became twice as big.
How to remove that part from the volume again, I don't know. How to move a part from one physical disk to another, I don't know either.
I thought that SYS had become unmountable because one sector of the source disk couldn't be read, so I decided to check the disk surface and try to clone it again.
The surface check showed a lot of errors, and again reported that all of them were remapped. For this cloning I took another disk, the same size as the "sick" one and from the same disk shelf, but a slightly different model. The cloning again needed many retries, but this time there were no unreadable sectors that I had to skip.
Shutdown.
Replacing "old" drive.
Booting.
Seems o.k., but....
Where is the STORAGE volume???
nss menu shows me that there are some parts of SYS on several disks and some unknown partitions that do not belong to any volume.
So. When I boot from the first cloned disk, the result is: volume STORAGE mounts successfully, but there is no volume SYS.
When I boot from the second cloned disk, I have SYS but no STORAGE.
But that is not the worst of it!
Now when I try to boot from the sick disk, I have the same trouble with SYS too: it doesn't want to assemble into a correct volume and mount!

Dear sirs!
Dear colleagues!
Dear enthusiasts who remain faithful to NetWare!
Is there any way to solve my problem? Maybe I can somehow get access to the data on the STORAGE volume? Maybe I can somehow repair a volume and exclude from the pool the part that sits on the bad disk? Maybe you have just one good idea, or a long and hard but successful way?
Thank you very much even for reading my long story...
Knowledge Partner

Re: NetWare 6.5 NSS volume disactivates

Tarasus wrote:

> You are my last Hope to solve existing and growing problem.


Hi Tarasus

Welcome to the Micro Focus forums.

I'm sorry you are having so many issues. There are still a few of us
who monitor these forums that have NetWare experience. I'm sure others
will also have some suggestions to offer you.

As you probably know, NetWare 6.5 is very old and unsupported. Is it at
least up to date with service packs? The last service pack is SP8 and
there have been a few NSS patches released after that.

I have read your post several times. You appear to have two issues:

1. File locking issues related to the SMART_IO.CRD file.
2. Data corruption on your hard drive.


Let's look at issue #2 first.

There are several things we can do to reduce data loss:

1. Hard drives will fail. To protect against a drive failure multiple
drives are configured in a RAID array but it appears you didn't do that.

2. We make regular backups, just in case we lose our data. Do you have
a current backup?

3. When we have a drive failure we immediately stop using it and back
it up. Cloning the drive is a good choice. Even better, make two copies.


When there are disk errors, the cloned disk may also have disk errors
and trying to fix those errors may just create more errors. That is why
we make two cloned copies. If you don't still have a clone of your
original disk that you haven't tried to repair, I would immediately
make two new cloned copies because it appears that the clone that you
have tried to fix may be beyond repair. 😞
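
If you do make two cloned copies as image files, one way to confirm that
they really are identical is to compare checksums. Below is a minimal
Python sketch of that idea. The function names and the image paths are
hypothetical, chosen only for illustration, and it assumes the clones
exist as regular files that Python can read.

```python
import hashlib

def image_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def images_match(path_a, path_b):
    """True when both image files have the same SHA-256 digest."""
    return image_digest(path_a) == image_digest(path_b)

# Hypothetical usage with two clone images of the same source disk:
# images_match("clone1.img", "clone2.img")
```

If the digests differ, at least one copy was read differently from the
failing disk, which is worth knowing before any repair is attempted.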

You know you have disk errors but you don't know what caused them. The
disk(s) may be defective but your server may also have problems. I
would suggest using a new(er) computer that you know is working
properly to make your cloned copies and attempt repairs.

Sometimes using a different cloning tool can produce a better clone. My
preference is to use the Linux "dd" command.
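
To illustrate what a block-by-block clone does when it meets unreadable
blocks, here is a minimal Python sketch. It is not how dd or any
commercial cloning tool is implemented. The names read_block and
clone_with_skips and the 512-byte block size are assumptions made for
this example. When a read fails with an I/O error, the sketch writes
zeros for that block and records its offset, which is broadly similar in
spirit to running GNU dd with conv=noerror,sync.

```python
import io

def read_block(src, offset, length):
    """Read exactly `length` bytes at `offset`, or fewer at end of data."""
    src.seek(offset)
    chunks = []
    remaining = length
    while remaining:
        chunk = src.read(remaining)
        if not chunk:  # unexpected end of data
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def clone_with_skips(src, dst, size, block_size=512):
    """Copy `size` bytes from src to dst, zero-filling unreadable blocks.

    Returns the byte offsets of the blocks that could not be read.
    """
    bad_offsets = []
    for offset in range(0, size, block_size):
        want = min(block_size, size - offset)
        try:
            data = read_block(src, offset, want)
        except OSError:
            data = b""
            bad_offsets.append(offset)
        # Pad failed or short reads so later data keeps its original offset.
        dst.write(data.ljust(want, b"\x00"))
    return bad_offsets

if __name__ == "__main__":
    # In-memory demo; a real run would open the source and target as binary files.
    source = io.BytesIO(b"A" * 1300)
    target = io.BytesIO()
    print(clone_with_skips(source, target, 1300))
```

In this sketch a smaller block size means less data is replaced by zeros
around each bad spot, at the cost of a slower copy.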

Once you have cloned the disk, put the original away in a safe place in
case you need it again. Next, you would want to check the disk for
errors by doing a sector by sector scan of the whole disk.

Once you know there are no drive errors, you need to check for
filesystem errors before attempting to use the drive. To check for
errors on your SYS and STORAGE volumes you need NSS utilities. That
means using another working NetWare system or, better yet, a new Open
Enterprise Server (OES) system. If you don't have OES, you can download
a trial version and use it to make repairs.
https://www.microfocus.com/products/open-enterprise-server/?utm_medium=301&utm_source=novell.com

The reason I recommend OES is that it contains the most recent version
of all the NSS utilities and automatically recognises NetWare volumes.
If you don't have any Linux experience there is a bit of a learning
curve. Alternatively, I would use another fully patched NetWare 6.5 SP8
server.

Once you again have a working server we can look at your other issue.



--
Kevin Boyle - Knowledge Partner
If you find this post helpful and are logged into the web interface,
please show your appreciation and click on the star below this post.
Thank you.
_____
Kevin Boyle - Knowledge Partner - Calgary, Alberta, Canada
Who are the Knowledge Partners?
If you appreciate my comments, please click the Like button.
If I have resolved your issue, please click the Accept as Solution button.
Tarasus Absent Member.

Re: NetWare 6.5 NSS volume disactivates

Hello, Kevin, good day to you!
First of all, let me say "thank you very much" for your answer. I know that the product is very old and has been discontinued, but I hoped that there were still some specialists who remember how to cook NW 🙂
Yes, I now have a good backup of the user data that is stored on the STORAGE volume.
I couldn't wait without doing something, so I did some more manipulations with the drives. I tried again to boot from each of the disks that contain the SYS volume: the original (failing) drive, the first cloned drive (Hitachi 146 GB), and the second cloned drive (Fujitsu MAW).
As though I had heard your advice to use another server, I took a server with the same configuration but a different chassis (I have three identical servers, and one of them runs a task that can wait a day or two) and "mixed" the drives there.
So... In the end I booted from the first, failing drive: only that setup gave me a mountable STORAGE volume, but SYS with errors. This time, when booting, I passed the -ns option to server.exe and started without autoexec. Without any hope I typed "nss /poolverify=SYS" (previously that command had only shown me a message that the named pool does not exist, but that happened AFTER the error, once the server had deactivated the volume and pool), and I saw the pool verify screen. When the pool verify finished, it showed that there were some errors and many warnings. After that I tried nss /poolrebuild=SYS, and it worked too! The program checked the pool and fixed the errors, and after that I could boot normally and finished copying the rest of the data.
In the first minutes after the "successful" boot I tried to back up the data somewhere else. But for some reason the server didn't want to work with the network: the interfaces were UP, the link was UP, the NLMs were loaded, but even load ping was unable to communicate with anything.
The next reboot solved the problem by itself: the network was enabled and functional.

About dd...
I have a little experience with *nix systems, and I have only used dd for sector-by-sector copies of whole partitions (as raw devices like /dev/sda1 or /dev/ad0s1a), but I don't know how an NSS volume is named or where I can restore the image I make.

The idea of a trial OES is very interesting... I hadn't thought in this direction. I'll have to try it; thank you very much again!

So... Now the data is safe, and I only need advice on how to exclude the piece of the SYS volume's pool that is on the damaged disk. I can (and I think I will) make a third attempt at cloning. Maybe I'll take another drive (I have 10 HDDs for experiments), or maybe I'll overwrite the Fujitsu MAW (that cloning attempt turned out to be the most useless). After cloning I'll try to rebuild SYS again, but in the end I'll have a system with 4 disks: 2 Seagate 36 GB drives (a piece of SYS, a piece of STORAGE); 1 Fujitsu MAW 72 GB (the replacement for the failing disk) with a piece of SYS, the bootable DOS partition, and the biggest part of the STORAGE volume; and 1 Hitachi 146 GB that contains a clone of the DOS partition, a piece of SYS (the one I unsuccessfully added while trying to migrate the volume), a part of STORAGE, and free space. Interestingly, when I run "nss menu" now, I see that I have a mirror, although I never created one. Maybe after cloning the system somehow "sees" that the volumes are identical and recognizes them as submirrors? The mirror, according to the system, consists of the whole of one of the Seagates and a partition on the Hitachi...
Now that the data is safe, I want to get the system into this configuration: the 146 GB Hitachi contains the bootable DOS partition, a part of SYS, and a part of STORAGE. Plus two Seagates: one contains a part of SYS and a part of STORAGE, and the second is entirely a part of STORAGE.
In principle, I can now rebuild STORAGE freely, maybe I can even re-create the volume, but how can I reconfigure SYS?
Thank you for your attention and your answer.
Knowledge Partner

Re: NetWare 6.5 NSS volume disactivates

Tarasus wrote:

Hi Tarasus,

What is the latest service pack applied to your NetWare system?

> Yes, I now have a good backup of the user data that is stored on the
> STORAGE volume.


If you have problems with your disk and/or the STORAGE volume, that is
not a good place to store your backup. 😞


> About dd...
> I have a little experience with *nix systems, and I have only used dd
> for sector-by-sector copies of whole partitions (as raw devices like
> /dev/sda1 or /dev/ad0s1a),


That is how I use it too.


> but I don't know how an NSS volume is named
> or where I can restore the image I make.


If you copied a whole drive or partition using "dd", the NSS data is
copied because it is within the partition. You cannot just restore the
NSS volume. What you can do is install the drive in a NetWare or OES
server and you should be able to mount the NSS volumes. Be aware,
because this is a cloned drive, pool names and volume names will be the
same as the original ones. You don't want to have duplicate pool names
or volume names on the same server as that can cause problems.


> So... Now the data is safe, and I only need advice on how to exclude
> the piece of the SYS volume's pool that is on the damaged disk.


I don't understand what you mean. If your disk is damaged, you should
not be using it. From what you say, it appears that your SYS volume may
be corrupted.


> Interestingly, when I run "nss menu" now, I see that I have a mirror,
> although I never created one. Maybe after cloning the system somehow
> "sees" that the volumes are identical and recognizes them as
> submirrors?


Duplicate names are not permitted and are likely responsible for this
behaviour.


> In principle, I can now rebuild STORAGE freely, maybe I can even
> re-create the volume, but how can I reconfigure SYS?
> Thank you for your attention and your answer.


Can you reinstall NetWare on a new disk, make a new SYS volume, apply
the latest SP8 patch, and test to make sure you have a working NetWare
server? Once you have a working system, install your cloned disk that
contains your NSS data and mount your STORAGE volume.


--
Kevin Boyle - Knowledge Partner
Tarasus Absent Member.

Re: NetWare 6.5 NSS volume disactivates

KBOYLE;2469063 wrote:

> What is the latest service pack applied to your NetWare system?

I have SP8 now. When I installed the system it was SP7, and I rolled SP8 up a bit later. I have an ISO image with SP7 embedded, and I am now looking for where SP8 is stored in my directories...

KBOYLE;2469063 wrote:

> If you have problems with your disk and/or the STORAGE volume, that is
> not a good place to store your backup. 😞

I didn't quite say what I meant. 🙂 I meant that I already have a copy of all the data from STORAGE, and it is now stored on the NAS.

KBOYLE;2469063 wrote:

> That is how I use it too.

Yesterday I booted from Gentoo (it was just the first random bootable Linux I took from the shelf) and tried dd. It copied 200 MB and halted due to an I/O error. Maybe my mistake was "bs=4096", which I usually set when dd'ing something via NFS. Now I'm trying dd again with bs=1024. "Two hours in, the flight is proceeding normally."

KBOYLE;2469063 wrote:

> If you copied a whole drive or partition using "dd", the NSS data is
> copied because it is within the partition. You cannot just restore the
> NSS volume. What you can do is install the drive in a NetWare or OES
> server and you should be able to mount the NSS volumes. Be aware,
> because this is a cloned drive, pool names and volume names will be the
> same as the original ones. You don't want to have duplicate pool names
> or volume names on the same server as that can cause problems.

My NSS volume is "spread" across three physical disks. So, logically, I have to plug all of them into the "new system" to see the correct volume?

KBOYLE;2469063 wrote:

> I don't understand what you mean. If your disk is damaged, you should
> not be using it. From what you say, it appears that your SYS volume may
> be corrupted.

Yes, the SYS volume was on the corrupted disk. Actually, one part of SYS is on the failing disk, and another part is on a different disk. When I tried to create SYS on the first cloned disk (which boots with only STORAGE and doesn't know anything about SYS), I made a mistake: I just extended the SYS pool onto this HDD, and now I don't know how to take it back. So now volume SYS lies on three disks: the damaged disk (Fujitsu MAT), one of the old but healthy disks (Seagate), and the first cloned disk (Hitachi).
If I try to boot from the failing disk (which I don't want to do) without the Hitachi, of course SYS can't mount. But if I try to rebuild, will it "forget" about the piece that lies on the Hitachi? I'm sure there is no actual data there, only free space. SYS holds about 2 GB, but the pool is now 17 GB.

KBOYLE;2469063 wrote:

> Duplicate names are not permitted and are likely responsible for this
> behaviour.

Aha. That's clear.

KBOYLE;2469063 wrote:

> Can you reinstall NetWare on a new disk, make a new SYS volume, apply
> the latest SP8 patch, and test to make sure you have a working NetWare
> server? Once you have a working system, install your cloned disk that
> contains your NSS data and mount your STORAGE volume.

I'll try today and will write the result here.
Thank you very much again.
Knowledge Partner

Re: NetWare 6.5 NSS volume disactivates

Tarasus wrote:

> My NSS volume is "spread" across three physical disks. So, logically,
> I have to plug all of them into the "new system" to see the correct
> volume?


A Volume is created in a Pool. A pool provides storage for volumes. The
storage is obtained from one or more disks (hard drives) or partitions.

If you have a pool that uses storage from three disks then you need all
three disks to access files stored on any volumes created in that pool.
You have to make sure none of the disks have any physical errors.
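
The relationship between a pool and its disks can be pictured with a
small conceptual model. The Python sketch below is only an illustration
of the rule described above. It is not an NSS API, and the class,
function, and device names are hypothetical labels taken loosely from
this thread.

```python
from dataclasses import dataclass, field

@dataclass
class Pool:
    name: str
    # Labels of the devices that hold a segment of this pool (hypothetical).
    segment_devices: set = field(default_factory=set)

def missing_devices(pool, healthy_devices):
    """Devices the pool needs that are absent or not healthy."""
    return sorted(pool.segment_devices - set(healthy_devices))

def can_activate(pool, healthy_devices):
    """In this model a pool is usable only when every segment is reachable."""
    return not missing_devices(pool, healthy_devices)

# A pool with segments on three disks, as in the SYS case described here.
sys_pool = Pool("SYS", {"fujitsu_mat", "seagate_1", "hitachi"})

if __name__ == "__main__":
    print(can_activate(sys_pool, ["fujitsu_mat", "seagate_1", "hitachi"]))
    print(missing_devices(sys_pool, ["fujitsu_mat", "seagate_1"]))
```

In this model, removing or losing any one device leaves the pool with a
missing segment, so none of the volumes in that pool can be used.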

If you have filesystem errors you can do a pool verify to check for
problems.



--
Kevin Boyle - Knowledge Partner
Tarasus Absent Member.

Re: NetWare 6.5 NSS volume disactivates

Good afternoon everybody, and the honorable KBOYLE personally!

After our last conversation I tried to "dd" the broken disk once again with bs=1024. The read took two days, but the image file was written successfully, without any errors. I took another 72 GB HDD (Fujitsu MAW) and overwrote it with this "fresh" image.
The server booted normally, the mirrored volume synchronized, and everything looks good.
Now I have four disks inserted in the 5-disk tray. I can boot from the Hitachi or from the Fujitsu MAW, and either way I end up with a working server. Only the mirrored volume re-syncs every time.
I ran something like a "stress test": reading and writing over 500,000 files, sized from 40 bytes to 1.2 KB, in parallel for the whole night. The server is still working.
I received errors on the SYS volume, but poolrebuild cured them, and poolverify says that everything is OK.

So it seems that the "pool deactivates" problem is more or less solved with your priceless help. Can we discuss the next one here, or do I have to start a new thread?
Knowledge Partner

Re: NetWare 6.5 NSS volume disactivates

Tarasus wrote:

> So it seems that the "pool deactivates" problem is more or less
> solved with your priceless help. Can we discuss the next one here,
> or do I have to start a new thread?


Hi Tarasus,

I'm happy to hear that you have solved this problem. I'm glad I could
help.

If you have a new issue or question, you should start a new thread. It
is easier to keep track of things this way.



--
Kevin Boyle - Knowledge Partner