mark-v

Absent Member.
2012-08-29
01:03
3250 views
NSS Pool deactivating
Hi,
I have started having a few issues with pools deactivating on some of our NetWare 6.5 SP8 servers.
Unfortunately, each of these servers has only one pool, which holds all the data volumes plus SYS and is around 650GB in size.
I can't remember the exact errors on screen, but after rebooting I ran a verify (which took around 8 hours). It reported no errors, but "WARNINGS Unaccounted Blocks Exist."
I am not sure what this means. About 5 weeks later it happened again. After a reboot all is fine, but now I am wondering if I should do a full rebuild?
The current NSS version is N65NSS8a. I have the next version, N65NSS8c, here and will apply it tonight. Should I leave it at that and see what it does, or do a rebuild after that as well?
This is a standard file & print server that runs GroupWise plus a few other services like DHCP, printing etc. I know all the warnings about there being a good chance of files being lost, but just how much of a chance is there?
Especially on the GroupWise side: this is the server in HO and the GW PO is around 200GB, so I am a bit hesitant.
This is from the VLF log - I am not sure how to interpret it.
Any ideas folks?
I am pretty confident it is not a hardware issue but, who knows.
Thanks - Mark
***************************
21 Jul 2012 11:57:28
══════════════════════════════════════════════════════════════════════════════
ERRORS
NONE.
══════════════════════════════════════════════════════════════════════════════
══════════════════════════════════════════════════════════════════════════════
WARNINGS
Unaccounted Blocks Exist.
══════════════════════════════════════════════════════════════════════════════
Pool: AQ Total Size: 697864 Meg (178653696 blocks)
══════════════════════════════════════════════════════════════════════════════
------- Pool Scan Report -------
Crosslinked Blocks.................0
Unaccounted Blocks...............221
Free Tree Blocks.............(50)599
Object Tree Blocks.................4
Name Tree Blocks...................1
Journal Blocks................528854
Purge Log Blocks...................9
Super Blocks......................64
Pool Info Blocks...................2
Other Tree Blocks..............22671
Used By Pool..................552256
Total In Use...............151812375
Unused Blocks...............26841321
Total Blocks...............178653696
Highest LSN.......0x00000005BB492FBF
Lowest LSN........0x0000000000000000
Object Tree Entries................6
Object Special Entries.............7
Salvage Tree Entries.........1110891
------ Pool System Report ------
Total Blocks................178653696
Blocks in Use...............151812375
Purgeable Blocks.............14567746
Non-Purgeable blocks................0
Pool Info Blocks....................2
----- Salvage System Report ----
Salvage Entries W/O IDs.............0
Salvage Entries W/O names...........0
Salvage Parents W/O IDs.............0
Salvage Parents W/O names...........0
Salvagable Objects..................0
Salvagable Blocks............14567746
Object Tree Levels..................2
Name Tree Levels....................1
Salvage Tree Levels.................4
User Rest. Tree Levels..............1
MFL Tree Levels.....................0
Logical Vols NOT Verified...........0
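For scale, the block counts in the report above can be turned into sizes with a little arithmetic. The following is only a rough sketch: it assumes "Meg" in the report means MiB, and it takes the block size to be 4096 bytes because that is what the reported total size divided by the total block count comes out to (neither assumption is stated in the log itself). The variable and function names are made up for illustration.

```python
# Figures copied from the Pool Scan Report and Pool System Report above.
TOTAL_SIZE_MB = 697_864        # "Total Size: 697864 Meg"
TOTAL_BLOCKS = 178_653_696
TOTAL_IN_USE = 151_812_375
UNUSED_BLOCKS = 26_841_321
UNACCOUNTED_BLOCKS = 221
PURGEABLE_BLOCKS = 14_567_746

# Assumption: "Meg" means MiB (2**20 bytes). Under that assumption the
# reported size and block count imply a block size very close to 4096 bytes.
implied_block_bytes = TOTAL_SIZE_MB * 2**20 / TOTAL_BLOCKS
BLOCK_BYTES = 4096  # assumed block size, rounded from the implied value


def blocks_to_mib(blocks, block_bytes=BLOCK_BYTES):
    """Convert a block count to MiB using the assumed block size."""
    return blocks * block_bytes / 2**20


# Sanity check: the in-use and unused counts should add up to the total.
counts_balance = TOTAL_IN_USE + UNUSED_BLOCKS == TOTAL_BLOCKS

print(f"implied block size : {implied_block_bytes:.1f} bytes")
print(f"in use + unused == total blocks: {counts_balance}")
print(f"unaccounted blocks : {blocks_to_mib(UNACCOUNTED_BLOCKS):.2f} MiB")
print(f"purgeable blocks   : {blocks_to_mib(PURGEABLE_BLOCKS) / 1024:.1f} GiB")
```

Under those assumptions the 221 unaccounted blocks come to less than 1 MiB out of a pool of roughly 680 GiB, while the purgeable (salvageable) blocks account for roughly 55.6 GiB. This says nothing about what caused the warning, only how much space is involved.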
10 Replies


Cadet 1st Class
2012-08-29
04:24
Hi,
You need to determine what triggered the pool to deactivate. It's usually going to be I/O failures on the storage. This can be a flaky controller, flaky disk, etc. I have never had a pool deactivate without an underlying hardware issue.
So what sort of storage is this? iSCSI? Directly attached? What server hardware are you using? Some details along those lines might be helpful. Pay close attention to the console screens: if there is a storage issue, you should see NSS errors indicating an I/O error.
Are you getting abends from this server?
-- Bob
ataubman

Absent Member.
2012-08-30
01:24
I would certainly run an NSS /poolrebuild /purge. That way the errors that verify finds will be cleaned up if possible, and the fixes in the new NSS code you have will actually be applied to that pool. In my experience the risk of data loss is minuscule (although not zero).
Andrew C Taubman (Sorry, support is not provided via e-mail) Opinions expressed above are not necessarily those of Micro Focus.
mark-v

Absent Member.
2012-08-30
02:19
Thanks. No, there have not been any abends recently, and sadly I did not capture all the errors on the console. What it seemed to amount to was that the array controller detected a problem writing to some part of the pool and decided to shut itself down, which in turn initiated the pool deactivation.
No logs were written that I can check, as the pool and volumes were dismounted.
Possibly this is an array controller issue then?
There are no hardware (HP ProLiant DL380 G7) logs recorded anywhere that indicate any issues with the hardware, but getting the array controller replaced is another option at this stage, I guess.
However, my main concern is with running the pool rebuild. I can restore data on Vol1, but since SYS is also on this pool, any loss of data from SYS could potentially hose the server....
Mark
ataubman

Absent Member.
2012-08-31
04:38
If the rebuild costs you data, that data was probably inaccessible anyway. In what, 15 years of doing NSS support, I can recall maybe a handful of cases of data loss from rebuilds. I really would go for it.
Andrew C Taubman (Sorry, support is not provided via e-mail) Opinions expressed above are not necessarily those of Micro Focus.
mark-v

Absent Member.
2012-09-03
03:06
Soooo . . . .
From all the warnings that there will pretty much always be some "data loss" .. would you confidently say that it would not include any currently accessible valid data? In other words, no files on SYS would be lost, thus preventing the server from restarting correctly?
I think I will still log an SR with Novell (which means you will most likely get it anyway 🙂) to check out the latest "nss/verify" logfile and advise.
Mark
ataubman

Absent Member.
2012-09-05
00:35
mark-v wrote:
From all the warnings that there will pretty much always be some "data loss" ..
That's the opposite of what I said:
me wrote:
In what, 15 years of doing NSS support, I can recall maybe a handful of cases of data loss from rebuilds
mark-v wrote:
would you confidently say that it would not include any currently accessible valid data? In other words, no files on SYS would be lost, thus preventing the server from restarting correctly?
Pretty sure, yes, but I can't guarantee that 100% of course.
mark-v wrote:
I think I will still log an SR with Novell (which means you will most likely get it anyway 🙂) to check out the latest "nss/verify" logfile and advise.
OK, but you'll be running a pool rebuild anyway - it's the only tool there is to fix a corrupt NSS structure.
Andrew C Taubman (Sorry, support is not provided via e-mail) Opinions expressed above are not necessarily those of Micro Focus.
rdy1

Absent Member.
2013-11-22
22:38
ataubman;2217090 wrote:
That's the opposite of what I said:
Pretty sure, yes, but I can't guarantee that 100% of course.
OK, but you'll be running a pool rebuild anyway - it's the only tool there is to fix a corrupt NSS structure.
I currently have a server in this same situation, but what troubles me is that, with the rebuild screen up, the progress bar has been idle with no changes for 18 hours or so. I hope this is not normal?
ataubman

Absent Member.
2013-11-24
22:33
rdy1;2295107 wrote:
I currently have a server in this same situation, but what troubles me is that, with the rebuild screen up, the progress bar has been idle with no changes for 18 hours or so. I hope this is not normal?
Certainly not, no, and I've not seen it before in all these years. I assume you are running it because there's been some problem with a pool, and this hang presumably means it's not fixable. I can't see a way past deleting, recreating and restoring.
Andrew C Taubman (Sorry, support is not provided via e-mail) Opinions expressed above are not necessarily those of Micro Focus.
rdy1

Absent Member.
2013-11-26
03:15
ataubman;2295179 wrote:
Certainly not, no, and I've not seen it before in all these years. I assume you are running it because there's been some problem with a pool, and this hang presumably means it's not fixable. I can't see a way past deleting, recreating and restoring.
That is very bad news. I am awaiting my SR and hoping for the best.
rdy1

Absent Member.
2013-12-04
23:41
ataubman;2295179 wrote:
Certainly not, no, and I've not seen it before in all these years. I assume you are running it because there's been some problem with a pool, and this hang presumably means it's not fixable. I can't see a way past deleting, recreating and restoring.
There is still a way, Andrew. For those in dire straits who can't mount their pool, don't give up. There is always a way.