Server Crash
On Monday (Feb 26) the server that hosts my site (and quite a few others, including NZLUG) went down when the data centre it lives in had a major power failure.
A few hours later, when the power was restored, the server would not boot. It powered up and detected the drives, but the boot loader wouldn’t even start.
After much asking around we managed to find someone, Liz, who could make a site visit for us (one admin, Nic, lives 600km away, and I was too busy at work to get away). When Liz arrived and was unable to bring the server to life, she transferred the drives to her server in the same data centre, and Nic began the process of moving the data to a new server (which had been in waiting for a while).
However, it became apparent that the primary drive (with all the system configuration files) was dead. Specifically, it would not spin up: the system detected it correctly, but it was not spinning at all.
The next day Liz visited the server again and applied some percussive maintenance (as suggested by Nic) - specifically she hit it with a screwdriver as it booted up. This worked on the second attempt and the drive spun up. There was some minor data loss, but overall everything came up and started working.
With the config files available again, the new server could be set up with all the existing settings and take over from the defunct server.
So, why did the drive fail?
Because it was old. Very old by computer standards. Some background - the short history of Wibble:
- Mid-Late 1997 - Nic, a couple of friends and I decide to build a server. Wibble is born - it is a Pentium 166MHz, with 64MB of RAM and a Quantum Fireball ST4.3S SCSI hard drive.
- 2001 - Due to increasing disk and processor demands, Wibble is upgraded again. It receives a new motherboard, CPU, RAM and an additional 20GB IDE hard drive (to become /home). The new server is a Celeron 500MHz, with 128MB RAM, now boasting a 4GB SCSI root drive, and 20GB for user home directories.
- Dec 2006 - Knowing that things are getting a little old, we arrange a new server. A dual Xeon 2.4GHz with 1GB of RAM and, initially, an 80GB mirrored RAID. All services will be migrated when time allows.
- Feb 2007 - Power fails. The almost 10-year-old Quantum Fireball drive fails to spin up when power is restored. Server migration is forced.
So, when the drive reached the end of its useful life, it had been in constant operation for probably at least 80,000 hours, with no more than a few hours of downtime from time to time.
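For the curious, here is a rough sanity check of that hours figure. It is only a sketch: the exact build date isn't recorded above, so the start date of 1 July 1997 is my assumption for "Mid-Late 1997", and the name `hours_between` is just an illustrative helper. Downtime is ignored.

```python
from datetime import date

# Assumption: "Mid-Late 1997" is taken to mean 1 July 1997.
# The real build date is not stated, so treat this as a guess.
ASSUMED_START = date(1997, 7, 1)

# The power failure happened on Monday, 26 Feb 2007.
FAILURE = date(2007, 2, 26)


def hours_between(start: date, end: date) -> int:
    """Whole hours between two calendar dates, ignoring any downtime."""
    return (end - start).days * 24


hours_in_service = hours_between(ASSUMED_START, FAILURE)
print(f"Roughly {hours_in_service:,} hours in service")
```

Each month the assumed start date moves later knocks roughly 720 hours off the total, so the "at least 80,000" estimate holds up for any start date in the second half of 1997.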
About this entry
You’re currently reading “Server Crash,” an entry on Tastes Like Chicken
- Published:
- 28.02.07 / 10pm
- Category:
- General Ramblings