Okay, so here's the story for all you Unix/Linux geeks out there who want to know why this took so long... :)
After discovering that my motherboard and CPU bit the dust (apparently from heat damage), I figured it would be a simple procedure to use an older CPU/motherboard combo I had lying around (which is going to be used as part of a Father's Day gift for my dad - keep it quiet!) for a week until the replacement parts come in.
Well, the burned-out system was built on the AMD Athlon (K7) architecture and had a kernel optimized for the K7, meaning that it tries to use instructions that only the K7 and newer processors support. The temporary system is an AMD K6 (no, I don't support Intel), which cannot handle many of the K7 instructions. So I knew going into this that I needed to boot the system with a different kernel.
This is where the surprises started. Using a rescue disk, it was easy enough to boot with a new kernel, but the system would hang immediately after loading the kernel. After some Usenet searches, I found that I was using a buggy kernel version. I had more success with a different version, which started detecting the hardware on startup but hung just before init (the mother of all processes in Unix) started. I was stumped at that point, and wasted a lot of time trying different kernels and initrd versions.
Eventually, I posted my problem to a local Linux user group mailing list and announced it on the MANN downtime page, and was fortunate enough that David Hill read about it. He pointed me to a Usenet post reminding me that recent releases of Red Hat come with an optimized glibc library, and that I was likely using a glibc version that was optimized for the K7. Since nearly every program on the system, init included, depends on glibc, there was no way I could boot the system unless I either replaced all of glibc by hand (not a fun possibility) or stuck my hard drive into another K7 machine, booted it there, and replaced the RPMs with pre-K7 versions. I was lazy, had some machines at work that would do the trick, and opted for the latter choice. I was then able to stick the disk back in the server and everything came up fine. In the process I also read a ton of documentation about bootloaders and the Unix boot process, so it wasn't a total waste of time.
It was, however, a waste of more than $200 that I had to shell out to get replacement parts (including a more reputable CPU heatsink and fan). I really don't feel comfortable openly asking for money, but donations at this time would be appreciated more than ever.
And for all you sysadmins out there, beware of architecture-optimized software packages (especially the main system libraries)! Optimizing for your CPU is a good idea in many ways, but it can leave you with a system that won't boot on older hardware in an emergency like this.
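If you want to know ahead of time which packages would bite you, here is a rough Python sketch of the check I wish I had run. It assumes you have saved a listing with one "name arch" pair per line (something like `rpm -qa --queryformat '%{NAME} %{ARCH}\n'` should produce that). The function names, the sample data, and the simplified compatibility chain are my own assumptions for illustration, not anything that ships with Red Hat.

```python
# Hypothetical helper: flag installed packages built for a newer CPU class
# than the spare machine you might have to fall back on.

# Assumed (simplified) compatibility chain for 32-bit x86 RPM arch tags:
# a CPU of a given class is assumed to run packages built for that class
# or any class earlier in the list.
X86_CHAIN = ["i386", "i486", "i586", "i686", "athlon"]


def runnable_archs(cpu_arch):
    """Return the set of arch tags a CPU of class cpu_arch is assumed to run."""
    idx = X86_CHAIN.index(cpu_arch)
    return set(X86_CHAIN[: idx + 1]) | {"noarch"}


def risky_packages(listing, cpu_arch):
    """Return (name, arch) pairs from the listing that cpu_arch cannot run.

    listing: text with one 'name arch' pair per line; other lines are skipped.
    """
    ok = runnable_archs(cpu_arch)
    risky = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, arch = parts
        if arch not in ok:
            risky.append((name, arch))
    return risky


# Made-up sample listing, not taken from the actual server.
sample = """\
glibc i686
kernel athlon
bash i386
tzdata noarch
"""

# Treat the spare K6 box as an i586-class CPU (an assumption).
print(risky_packages(sample, "i586"))
```

Anything this prints is a package you would need to swap for a more generic build before moving the disk to the older machine. The main system libraries are the ones to check first.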