NASLite Network Attached Storage

www.serverelements.com
Task-specific simplicity with low hardware requirements.
All times are UTC - 5 hours [ DST ]




Post subject: Kernel panic
Posted: Tue Apr 08, 2008 9:12 am

Joined: Sat May 27, 2006 1:56 pm
Posts: 59
My ultra-reliable NASLite server (which has been running 24/7 for several months without issue) has just started playing up.

This morning, in the middle of a file transfer to the server, an error message appeared indicating the transfer could not be completed, and all disks on the server became inaccessible. I've re-booted three times now, but it won't come back. On the third boot, I took the following (partial) notes. All was well until the filesystem check:
Code:
Disk 0: FAIL
Disk 1: FAIL
Disk 2: DONE
Disk 3
Unable to handle kernel NULL pointer dereference at virtual address 0000000C *pde = 00000000
oops=0000
CPU=0
EIP=0010: [<c012a899>]  Not tainted  [and a lot more snipped]
FAIL
Disk 4: DONE
Disk 5
Unable to handle kernel paging request at virtual address 819db829 printing eip c02a8a33
[a lot more snipped]
<0> Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

And that's where it freezes. It doesn't get as far as Disk 6 (the last disk). All suggestions welcome. As I said at the top, this has been a very stable system, and I haven't made any recent changes to the machine or the network.


Last edited by jdk on Tue Apr 08, 2008 7:36 pm, edited 1 time in total.

Post subject: Re: Kernel panic
Posted: Tue Apr 08, 2008 3:06 pm
Site Admin

Joined: Tue Jul 13, 2004 4:11 pm
Posts: 1771
Location: Server Elements
My guess is bad RAM. Reseat the RAM and run a test. If it passes, try booting the NAS again.


Post subject: Re: Kernel panic
Posted: Tue Apr 08, 2008 3:12 pm

Joined: Sat May 27, 2006 1:56 pm
Posts: 59
Thanks for the response. Does RAM "go bad"? (presuming it was good to begin with)


Post subject: Re: Kernel panic
Posted: Tue Apr 08, 2008 3:32 pm
Site Admin

Joined: Tue Jul 13, 2004 4:11 pm
Posts: 1771
Location: Server Elements
Not generally, but it does happen.


Post subject: Re: Kernel panic
Posted: Tue Apr 08, 2008 3:39 pm

Joined: Sun Apr 02, 2006 9:05 pm
Posts: 1688
Location: Up State NY in the USA!!!!
He didn't say your RAM was bad. He told you to re-seat the DIMMs and try booting again; there is a difference. Remove them and then re-insert them into the sockets. It is very unlikely that the RAM itself is bad. There is also a chance that the processor got jarred loose and needs to be re-seated as well.

Even after that, you may still have bad sectors on the drives. In a fall like that during operation, the heads might have "crashed" into the surface of the platters and damaged the data in those areas, and possibly the platters and even the heads, depending on the severity of the fall.

Mike


Post subject: Re: Kernel panic
Posted: Tue Apr 08, 2008 7:33 pm

Joined: Sat May 27, 2006 1:56 pm
Posts: 59
mikeiver1 wrote:
Even after that, you may still have bad sectors on the drives. In a fall like that during operation, the heads might have "crashed" into the surface of the platters and damaged the data in those areas, and possibly the platters and even the heads, depending on the severity of the fall.

Mike


I think my (very careless) choice of words may have led to a misunderstanding. When I said in my first post that the server "fell over", I simply meant that it stopped working -- i.e. a file transfer stopped in mid-transfer with an error message indicating the transfer could not be completed, and the system has since failed to boot. Apologies if I confused anyone. I have now edited the first post so that it is clear.

I did, in any case, re-seat the RAM, and have re-booted a few more times. Each time I seem to get a bit closer to a successful boot -- last time the filesystem check got as far as the last (seventh) disk before freezing on a kernel error (although two disks still failed the check). I'll keep trying.


Post subject: Re: Kernel panic
Posted: Tue Apr 08, 2008 7:56 pm

Joined: Sun Apr 02, 2006 9:05 pm
Posts: 1688
Location: Up State NY in the USA!!!!
Drop back to one known good DIMM and try to boot the box; you may have had a stick go bad on you.

Mike


Post subject: Re: Kernel panic
Posted: Tue Apr 08, 2008 9:18 pm

Joined: Sat May 27, 2006 1:56 pm
Posts: 59
Curiouser and curiouser... I also started getting "out of memory - system halted" errors at bootup, which also seemed to suggest a RAM issue. So I swapped out the RAM for known good RAM. (The system is a Celeron 700MHz with 2x128MB of old PC133 DIMMs, 7 HDDs (3xIDE, 4xSATA) and a Seasonic 430W power supply.)

And I'm still getting "out of memory - system halted" errors at bootup, and kernel panics if I get as far as the filesystem checks. I did manage to boot up once and get as far as the admin login. But although I was able to use the administration utility, the disks were still not available on the network, and I could not browse to the status information pages. One disk had failed to mount.

Then when I rebooted from the admin utility, I only got as far as an "out of memory" error.

I really am at a loss now. This is a system that has run fault-free for a long time. Power supply maybe? That has been suggested before in these forums in response to the "out of memory" error. I switched ACPI off and that has not helped.

EDIT: I've also tried burning a new CD from the NASLite .iso file in case the old one had developed errors. No change.

EDIT: I find that if I disable ACPI, I am less likely to get the out of memory messages -- but I still get the kernel panics at later points in the boot cycle.


Last edited by jdk on Wed Apr 09, 2008 8:01 pm, edited 1 time in total.

Post subject: Re: Kernel panic
Posted: Wed Apr 09, 2008 9:30 am
Site Admin

Joined: Tue Jul 13, 2004 4:11 pm
Posts: 1771
Location: Server Elements
Run a memory test using memtest (http://www.memtest.org/). Let it complete a few cycles before assuming the RAM and most of your board are operational. If all that works, reset your BIOS to default settings and try booting NASLite without any storage drives. Check the syslog and make sure all is OK. Then start adding the drives.

With problems like this, the only approach is a process of elimination. Not sure if that makes sense, but it will help you isolate the general area where the problem is coming from.
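
The add-back procedure described above can be sketched in a few lines of Python. This is only an illustration, not anything NASLite provides: the function name `first_failing_addition`, the drive labels, and the `boots_cleanly` callback (standing in for a manual boot plus a look at the syslog) are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def first_failing_addition(
    drives: Sequence[str],
    boots_cleanly: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Add drives one at a time; return the first whose addition breaks the boot.

    Returns "base system" if the machine fails with no drives attached,
    and None if every configuration boots cleanly.
    """
    attached = []
    # Step 1: bare system, no storage drives.
    if not boots_cleanly(attached):
        return "base system"
    # Step 2: re-attach drives one by one, testing after each addition.
    for drive in drives:
        attached.append(drive)
        if not boots_cleanly(attached):
            return drive
    return None


# Hypothetical example: pretend any configuration containing "sata2" fails.
drives = ["ide0", "ide1", "ide2", "sata0", "sata1", "sata2", "sata3"]
suspect = first_failing_addition(drives, lambda attached: "sata2" not in attached)
print(suspect)
```

If the fault is intermittent, a single clean boot proves little; in practice each configuration would need to run for a while before the next drive goes in.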

Hope that helps.


Post subject: Re: Kernel panic
Posted: Wed Apr 09, 2008 8:00 pm

Joined: Sat May 27, 2006 1:56 pm
Posts: 59
Okay, so I ran Memtest, and I had about 10 errors in the high-stress test (#5) and #8. Otherwise error-free. So I swapped out the memory for a different set -- and still had five or six errors in test #5 and #8.

So here's my logic -- please rip apart if incorrect. Given that the test failed in similar fashion on two different sets of RAM (both of which have been operating without error up to now), that means the RAM alone is not likely to be the problem. It is either:

(a) a compatibility issue -- unlikely as this system has been running without issue for at least a year.
OR
(b) other hardware is failing: motherboard, CPU or power supply (or SATA card?). Much more likely.

My hunch is that the most likely culprit is the motherboard, so I intend to rebuild the server with a new motherboard when I get a chance. If anyone thinks I'm taking the wrong approach, please say.

EDIT: After the Memtest session, I tried to boot the server using just a single HD (IDE). It started without any problem and is visible on the network. I can transfer files to it. I am now very confused. I guess I'll add back a disk at a time until it fails again...

EDIT: A few hours later, the (single-disk) server vanished from the network again. So unless anyone has a better suggestion, I'll press on with replacing the motherboard...


Post subject: Re: Kernel panic
Posted: Thu Apr 10, 2008 3:03 am

Joined: Fri Jan 12, 2007 4:27 am
Posts: 577
Location: Scotland
Reading this through, it seems more and more likely that you are having power supply issues. As power supplies get warmer, their capacity reduces, so a cold power supply may boot the machine, but a warm one may drop drives. The fact that you cold booted with one HDD and it later failed would go some way to support this. You have at least 6 drives and presumably a CD-ROM attached - what wattage and make is your power supply?
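
To put rough numbers on the power-supply theory, here is a minimal sketch of a 12 V rail budget at spin-up. Every figure in it is a placeholder assumption (per-drive spin-up current, other 12 V load, rail rating, and the warm derating factor); the real values would have to be read off the drive and power supply labels. The function name `rail_headroom_amps` is hypothetical.

```python
def rail_headroom_amps(
    drive_count: int,
    amps_per_drive: float,
    other_load_amps: float,
    rail_capacity_amps: float,
    derating: float = 1.0,
) -> float:
    """Return spare current on the 12 V rail; a negative value means overload.

    `derating` scales the rated capacity down to model a warm or aged supply
    (1.0 = full rated output).
    """
    demand = drive_count * amps_per_drive + other_load_amps
    usable = rail_capacity_amps * derating
    return usable - demand


# ASSUMED placeholder figures, not measurements from this system:
# 7 drives (from the thread), 2.0 A each at spin-up, 5.0 A for everything
# else on 12 V, and a 30 A rail rating.
cold = rail_headroom_amps(7, 2.0, 5.0, 30.0)
warm = rail_headroom_amps(7, 2.0, 5.0, 30.0, derating=0.6)

print(f"cold headroom: {cold:+.1f} A")
print(f"warm headroom: {warm:+.1f} A")
```

With these made-up numbers the supply has margin when cold and none when derated, which is the pattern a marginal supply would show. Real label values could just as easily give comfortable headroom in both cases.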


Post subject: Re: Kernel panic
Posted: Thu Apr 10, 2008 6:09 am

Joined: Sat May 27, 2006 1:56 pm
Posts: 59
NickC wrote:
You have at least 6 drives and presumably a CD-ROM attached - what wattage and make is your power supply?

Seasonic 430 watts.


Post subject: Re: Kernel panic
Posted: Thu Apr 10, 2008 6:13 am

Joined: Fri Jan 12, 2007 4:27 am
Posts: 577
Location: Scotland
jdk wrote:
NickC wrote:
You have at least 6 drives and presumably a CD-ROM attached - what wattage and make is your power supply?
Seasonic 430 watts.

Is it comparatively new? Capacitor ageing will also reduce output over time.


Post subject: Re: Kernel panic
Posted: Thu Apr 10, 2008 7:12 am

Joined: Sat May 27, 2006 1:56 pm
Posts: 59
The power supply is two years old.


Post subject: Re: Kernel panic
Posted: Thu Apr 10, 2008 7:53 am

Joined: Fri Jan 12, 2007 4:27 am
Posts: 577
Location: Scotland
Seasonic is a reputable brand, so a two-year-old unit should be fine (unless this is its second home and it was thrashed to within a milliamp of its "life" in a previous high-spec PC...).

