Anonymous_User Absent Member.

Is this really a hardware abend ?

Is this really a hardware abend, or do we need to apply driver patches?


*************************************************
*********************************************************
Novell Open Enterprise Server, NetWare 6.5
PVER: 6.50.05

Server xxxxxxx halted Thursday, August 31, 2006 12:41:34.321 am
Abend 1 on P00: Server-5.70.05: Nonmaskable Interrupt Processor Exception
(Error code 000000A0)

Registers:
CS = 0060 DS = 007B ES = 007B FS = 007B GS = 007B SS = 0068
EAX = 4AD11A22 EBX = 00000000 ECX = BA6104C0 EDX = 833023CD
ESI = 00000000 EDI = 00000000 EBP = 00000000 ESP = BA7AEF78
EIP = 00210D9B FLAGS = 00000082
00210D9B 85F6 TEST ESI, ESI
EIP in SERVER.NLM at code start +0000FB5Bh

The violation occurred while processing the following instruction:
00210D9B 85F6 TEST ESI, ESI
00210D9D 0F8452010000 JZ 00210EF5
00210DA3 833D80D1030000 CMP [0003D180]=00000000, 00000000
00210DAA 0F8490030000 JZ 00211140
00210DB0 B201 MOV DL, 01
00210DB2 31DB XOR EBX, EBX
00210DB4 8815F8C14100 MOV [SERVER.NLM|AccelerateDelayedWorkToDoFlag]=00, DL
00210DBA A1A8C14100 MOV EAX, [SERVER.NLM|DelayedWorkToDoList]=00000000
00210DBF 50 PUSH EAX
00210DC0 8B30 MOV ESI, [EAX]



Running process: Server 11 Process
Thread Owned by NLM: SERVER.NLM
Stack pointer: BA7AEF60
OS Stack limit: BA7A7020
CPU 0 (Thread BA7BA2E0) is in a NO SLEEP state
Scheduling priority: 67371008
Wait state: 50500F0 Waiting for work
Stack: --00000000 (LOADER.NLM|KernelAddressSpace+0)
--BA7BA2E0 (PEDGE3.HAM|printDeviceType+2858)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--BA7BA2E0 (PEDGE3.HAM|printDeviceType+2858)
0021DEC8 (SERVER.NLM|TcoNewSystemThreadEntryPoint+40)
--BA7BA2E0 (PEDGE3.HAM|printDeviceType+2858)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--34343434 ?
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--34343434 ?
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--00000000 (LOADER.NLM|KernelAddressSpace+0)
--55575653 ?
--8B68EC83 ?
--0084249C (PEDGE3.HAM|msgex+4028)
--D2310000 ?
--E430F631 ?
--5489ED31 ?
--89534824 ?
--88502454 ?
--E8682464 ?
--000081E4 (LOADER.NLM|KernelAddressSpace+81E4)
--E194158B ?
--C4830083 ?
--702D8904 ?
--310083E4 ?
--246C89C0 ?
--74D2855C ?
--245C8B1A ?
--C083435C ?
--245C8904 ?
--0CFB835C ?
--B8830973 ?
-0083E194 (PEDGE3.HAM|megaAdapterInfo+0)
--31E67500 ?
--8308EBC0 ?
--F88304C0 ?
--830F7430 ?
-83E194B8 ?
--EF740000 ?
--4C2444FF ?
--5C8BE9EB ?
--FB835C24 ?
--681F720C ?
-0083B000 (PEDGE3.HAM|(Data Start)+0)
--84248C8B ?
--51000000 ?
--B28BE7E8 ?
--FFFFB845 ?
--C483FFFF ?
--09ECE908 ?
--D8890000 ?
--0100000D ?
--10685000 ?
--68BA7B01 ?
BA7AFFC4 (PEDGE3.HAM|mega_HAM_Execute_HACB+0)
--7B080468 ?
--0F1468BA ?
--D468BA7B ?
--8BBA7AFA ?
--009424AC ?
--68550000 ?
--00000371 (LOADER.NLM|KernelAddressSpace+371)
--6424448D ?
--DD0AE850 ?
--C483FF59 ?
--0FC08524 ?
--0001DB84 (LOADER.NLM|BIOSDriveCount+3894)
--01F88300 ?
--01B0850F ?
--94680000 ?
--68000018 ?
-0083E474 (PEDGE3.HAM|msgex+0)
--4FE8006A ?
--8345BB0A ?
--74680CC4 ?
--B20083E4 ?
--24448B11 ?

Additional Information:
There may be some bad memory either on an adapter card or on the
motherboard. If the problem continues, try replacing the main system
memory or adapter cards to prevent future parity errors.


Re: Is this really a hardware abend ?

Unless things have changed -- NMI's are very likely hardware --
typically bad memory somewhere in the mix.



--
Barry Schnur
Novell Support Connection Volunteer Sysop

Re: Is this really a hardware abend ?

Hi,

grpadmin@yahoo.com wrote:
>
> Is this really a hardware abend or we need to apply driver patches ?


The chances of this being a hardware problem are over 95%.

CU,
--
Massimo Rosen
Novell Support Connection Sysop
No emails please!
http://www.cfc-it.de

Re: Antw: Is this really a hardware abend ?

Walter,

Walter Hofstädtler wrote:
>
> Hi,
>
> visit http://www.memtest.org/ for a free memory test program.


Not all NMIs are main-memory related. In fact, these days many are
caused by PCI cards.

CU,
--
Massimo Rosen
Novell Support Connection Sysop
No emails please!
http://www.cfc-it.de

Re: Antw: Is this really a hardware abend ?

> Not all NMIs are main-memory related. In fact, these days many are
> caused by PCI cards.
>

OK -- didn't know that PCI cards were climbing up the list for NMI's.


--
Barry Schnur
Novell Support Connection Volunteer Sysop

Re: Is this really a hardware abend ?

On Fri, 01 Sep 2006 00:40:46 -0700, Barry Schnur <BSchnur@cox.net> wrote:

> Unless things have changed -- NMI's are very likely hardware --
> typically bad memory somewhere in the mix.
>


The PEDGE's hardware is often an on-the-motherboard HBA, but if you have
a separate HBA, you (the OP; Barry's already been through this) might
check that it is properly seated in the socket.

/dps

--
Using Opera's revolutionary e-mail client: http://www.opera.com/m2/

Re: Is this really a hardware abend ?

> The PEDGE's hardware is often an on-the-motherboard HBA, but if you have
> a separate HBA, you (the OP; Barry's already been through this) might
> check that it is properly seated in the socket.
>

Good point -- the first thing I do if I run into an NMI (hoping for
the simple stuff) is open the server up and make sure that not only the
memory sticks, but also all cards and connectors, are firmly seated in
place.



--
Barry Schnur
Novell Support Connection Volunteer Sysop

Re: Is this really a hardware abend ?

On Fri, 01 Sep 2006 15:42:35 -0700, Barry Schnur <BSchnur@cox.net> wrote:

> Good point -- the first thing I do if I run into an NMI (hoping for
> the simple stuff) is open the server up and make sure that not only the
> memory sticks, but also all cards and connectors, are firmly seated in
> place.


I've had to do that after plugging a cable in a little too
enthusiastically =8-o

But starting with the simple things is often the way to retain sanity.

/dps

--
Using Opera's revolutionary e-mail client: http://www.opera.com/m2/

Re: Is this really a hardware abend ?

> I've had to do that after plugging a cable in a little too
> enthusiastically =8-o


Ah yes, taking out one's frustrations -- been there, done that.
>
> But starting with the simple things is often the way to retain sanity.


The little of it we have, eh? <smile>.


--
Barry Schnur
Novell Support Connection Volunteer Sysop

Re: Antw: Is this really a hardware abend ?

Massimo,

I agree with you, but running a memory test helps to narrow it down:
memory error or PCI problem.

Walter


>>> Massimo Rosen <mrosenno@spamcfc-it.de> wrote on Friday, 01 September
2006 at 15:13 in message <44F831DD.C6686EBE@spamcfc-it.de>:
> Walter,
>
> Walter Hofstädtler wrote:
>>
>> Hi,
>>
>> visit http://www.memtest.org/ for a free memory test program.

>
> Not all NMIs are main-memory related. In fact, these days many are
> caused by PCI cards.
>
> CU,


Re: Antw: Is this really a hardware abend ?

Hi,

Walter Hofstädtler wrote:
>
> Massimo,
>
> I agree with you, but running a memory test helps to narrow it down:
> memory error or PCI problem.


In theory. In reality, I don't believe in software-based memory tests
running on hardware that is probably broken. In my experience, tools
like memtest have a hit rate of a bit over 50%; in other words, they are
about as good as a guess. The real problem is that if memtest (or the
like) comes back without errors, it essentially means nothing. That
usually leads to admins firmly believing their memory can't be the
culprit because memtest came back clean, when in reality it *is* the
memory.

OTOH, *if* memtest finds memory errors, it's usually right (although far
from always). The cause can still be the CPU, the motherboard, or the
power supply.
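
To put rough numbers on that reasoning, here is a minimal sketch using Bayes' rule. Only the "a bit over 50%" hit rate comes from the post above. The prior probability that the memory is bad and the false-alarm rate are invented inputs, and the function names are hypothetical.

```python
def p_bad_given_clean(prior_bad, hit_rate, false_alarm_rate):
    """Probability the memory is bad even though the test reported no errors."""
    missed = prior_bad * (1.0 - hit_rate)                    # bad memory, test missed it
    truly_clean = (1.0 - prior_bad) * (1.0 - false_alarm_rate)  # good memory, clean run
    return missed / (missed + truly_clean)


def p_bad_given_errors(prior_bad, hit_rate, false_alarm_rate):
    """Probability the memory is bad when the test did report errors."""
    caught = prior_bad * hit_rate                            # bad memory, test caught it
    false_alarm = (1.0 - prior_bad) * false_alarm_rate       # good memory, errors anyway
    return caught / (caught + false_alarm)


# Assumed inputs: 50% prior suspicion of bad memory (invented),
# 50% hit rate (from the post), 5% false alarms (invented).
after_clean = p_bad_given_clean(0.5, 0.5, 0.05)
after_errors = p_bad_given_errors(0.5, 0.5, 0.05)
print(f"memory bad after a clean run:  {after_clean:.2f}")
print(f"memory bad after errors found: {after_errors:.2f}")
```

With these made-up inputs, a clean run still leaves roughly a one-in-three chance that the memory is bad, while reported errors push it to about nine in ten. Different assumptions will shift the numbers, but not the asymmetry.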

CU,
--
Massimo Rosen
Novell Product Support Forum Sysop
No emails please!
http://www.cfc-it.de

Re: Antw: Is this really a hardware abend ?

Barry,

Barry Schnur wrote:
>
> > Not all NMIs are main-memory related. In fact, these days many are
> > caused by PCI cards.
> >

> OK -- didn't know that PCI cards were climbing up the list for NMI's.


They do, in various different colours. First of all, the PCI bus also
has an NMI line, i.e. every PCI card can trigger an NMI on its own and
on purpose.

Additionally, most PCI cards have their own memory, which gets mapped
into and over the normal main memory of the computer (at the very end of
the 4GB address space). As such, even a memory-related NMI can in fact
come from memory that is physically located on a PCI card instead of in
the computer's main memory.
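
As a rough illustration of that mapping, the sketch below looks up which region of a 32-bit physical address space an address falls into. The layout is entirely invented (2 GB of RAM and two device windows near the top of the 4GB space) and the names are hypothetical. A real server's map would have to be read from its BIOS or PCI configuration.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    start: int  # first physical address of the region
    size: int   # length in bytes

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.start + self.size


# Hypothetical layout, for illustration only.
REGIONS = [
    Region("main memory", 0x0000_0000, 2 * 1024**3),
    Region("RAID controller window (hypothetical)", 0xFC00_0000, 0x0100_0000),
    Region("NIC window (hypothetical)", 0xFEB0_0000, 0x0002_0000),
]


def owner_of(addr: int) -> Optional[str]:
    """Return the name of the region that decodes this address, if any."""
    for region in REGIONS:
        if region.contains(addr):
            return region.name
    return None


for addr in (0x0010_0000, 0xFC00_1000, 0xF000_0000):
    print(f"{addr:08X} -> {owner_of(addr)}")
```

The point is only that a faulting "memory" address near the top of the address space may belong to a card rather than to a DIMM.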

CU,
--
Massimo Rosen
Novell Product Support Forum Sysop
No emails please!
http://www.cfc-it.de

Re: Antw: Is this really a hardware abend ?

> They do, in various different colours. First of all, the PCI bus also
> has an NMI line, i.e. every PCI card can trigger an NMI on its own and
> on purpose.
>
> Additionally, most PCI cards have their own memory, which gets mapped
> into and over the normal main memory of the computer (at the very end of
> the 4GB address space). As such, even a memory-related NMI can in fact
> come from memory that is physically located on a PCI card instead of in
> the computer's main memory.
>

OK -- so you would expect video and controller cards in particular,
with NIC's possible as well. I suppose if one had a server running
SLI and two video cards, you'd really be in trouble <smile>.



--
Barry Schnur
Novell Support Connection Volunteer Sysop

Re: Is this really a hardware abend ?

Does PEDGE3.HAM always show in the abend stack traces for NMI abends? If
yes, it is possible that your RAID controller is starting to have problems.
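
One way to check this across several abend logs is to count which modules are named in the symbolic stack entries. The following is a minimal sketch that assumes the stack lines look like the ones posted above, e.g. `--BA7BA2E0 (PEDGE3.HAM|printDeviceType+2858)`. The function name and the regular expression are hypothetical helpers, not part of any Novell tool.

```python
import re
from collections import Counter

# Matches the "(MODULE.EXT|" part of a symbolic stack entry.
STACK_REF = re.compile(r"\(([A-Z0-9_]+\.[A-Z0-9]+)\|")


def modules_in_stack(abend_text: str) -> Counter:
    """Count how often each module is named in the stack entries."""
    counts = Counter()
    for line in abend_text.splitlines():
        match = STACK_REF.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the abend log in the first post.
sample = """\
Stack: --00000000 (LOADER.NLM|KernelAddressSpace+0)
--BA7BA2E0 (PEDGE3.HAM|printDeviceType+2858)
0021DEC8 (SERVER.NLM|TcoNewSystemThreadEntryPoint+40)
--34343434 ?
-0083E194 (PEDGE3.HAM|megaAdapterInfo+0)
"""

for module, count in modules_in_stack(sample).most_common():
    print(module, count)
```

Treat the counts as a hint only: a module name on a stack line means a value on the stack happened to resolve to a nearby symbol, not that the module was necessarily executing.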

--
Marcel Cox (using XanaNews 1.18.1.3)