I swear, my luck with hard drives is really rotten. I just lost the OS drive in my MythTV box, and that marks the second time in as many years (and the third time overall).
It shouldn't be surprising. I've got 8 drives in always-on systems, and I was sure to lose another eventually. It's just too bad it wasn't one from the RAIDz array.
Anyway, the last time I lost the primary (and at the time only) drive in my MythTV system, I responded by rebuilding the thing with RAID 1. It chugged along happily for a while with no issue.
At some point, I picked up a small-form-factor barebones kit to replace the massive Dell tower that I had been using. In moving to the smaller kit, I was forced to sacrifice the second drive.
Of course, now, I pay the price.
Luckily, the price isn't that high. When I set up my RAIDz array a while back, I offloaded all of the actual media files onto it and exported them via NFS. A drive failure in the MythTV system itself doesn't cause me to lose any of those.
At the same time, I also configured Bacula to back up everything else "important" to the RAIDz pool, and I rsync those backups to an external drive. This works remarkably well, and until now I've had no cause to use it.
I noticed the drive failure last night, when I tried to upload a newly ripped CD. I didn't have time to do anything then; I just hit the Gentoo website and started downloading the latest live CD (since god knows where I put my old one) and told Bacula to restore everything to the local filesystem.
This morning, I got up a bit early and swapped out the failed drive with the one that used to be its mirror. I briefly considered trying to recover a bootable system from the outdated mirror, but quickly thought better of it; the data was really stale and would have to be replaced anyway. Might as well just nuke it from orbit and do a bare metal restore.
Once I had the live CD booted, it was pretty straightforward to recover from there. The Bacula restore job had finished the night before, so all I had to do was partition the replacement drive and rsync the backup over from the Solaris box.
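For anyone repeating this, here is a rough sketch of that rsync step, written as a small Python helper that only builds the command line. It is not the exact command I ran. The host name `solaris-box`, the restore path `/tank/restore/mythtv`, and the mount point `/mnt/gentoo` are all made up for illustration. The one flag worth calling out is `--numeric-ids`: a live CD has its own user and group tables, so copying ownership by number rather than by name avoids remapping files to the wrong owners.

```python
import shlex


def build_restore_command(source, target, dry_run=True):
    """Build an rsync command that copies a restored tree onto a new root.

    source and target are hypothetical examples, not paths from this post.
    """
    cmd = [
        "rsync",
        "-aH",            # archive mode, and preserve hard links
        "--numeric-ids",  # keep raw uid/gid values instead of mapping by name
    ]
    if dry_run:
        cmd.append("--dry-run")  # show what would be copied, change nothing
    # Trailing slashes: copy the contents of source into target.
    cmd.append(source.rstrip("/") + "/")
    cmd.append(target.rstrip("/") + "/")
    return cmd


if __name__ == "__main__":
    # Assumed names: "solaris-box", "/tank/restore/mythtv", "/mnt/gentoo".
    command = build_restore_command(
        "solaris-box:/tank/restore/mythtv", "/mnt/gentoo"
    )
    print(shlex.join(command))
```

The helper defaults to a dry run so the printed command can be checked before anything is written to the new drive.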
Unfortunately, I had failed to back up the boot partition. Not a big problem, but I had to go back in and recreate it, building a new initrd and writing a new grub.conf. I had also neglected to create /dev/console and /dev/null on the actual / partition, which caused the boot to fail until I went back and did so. Lessons learned there.
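The device node fix is simple enough to sketch. On Linux, /dev/console is character device 5,1 and /dev/null is character device 1,3. The Python below is a minimal illustration, assuming the new root is mounted somewhere like `/mnt/gentoo` (a made-up mount point) and that it runs as root from the live CD. The function name and the permission modes are my own choices here, not something dictated by Gentoo.

```python
import os
import stat

# (name, major, minor, permissions) using the standard Linux device numbers.
DEVICE_NODES = [
    ("console", 5, 1, 0o600),
    ("null", 1, 3, 0o666),
]


def create_device_nodes(root):
    """Create the early-boot device nodes under <root>/dev. Requires root."""
    dev_dir = os.path.join(root, "dev")
    os.makedirs(dev_dir, exist_ok=True)
    created = []
    for name, major, minor, mode in DEVICE_NODES:
        path = os.path.join(dev_dir, name)
        if os.path.lexists(path):
            continue  # leave an existing node alone
        # S_IFCHR marks the node as a character device.
        os.mknod(path, mode | stat.S_IFCHR, os.makedev(major, minor))
        os.chmod(path, mode)  # mknod honors the umask, so set the mode again
        created.append(path)
    return created


# Example call, with a hypothetical mount point:
#   create_device_nodes("/mnt/gentoo")
```

These nodes have to live on the root filesystem itself because the early boot scripts may open them before anything gets mounted over /dev.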
I also lost my large "scratch" partition. I tend to keep a collection of useless junk around, and in this case I had already decided that these things were acceptable losses in a recovery scenario. In a way, it's actually nice to have this cleaned out.
The total time from cracking the case to having the system fully running with the prior night's backup was approximately 3 hours. I know I'm probably not going to see three nines on my DVR, but that's not a bad turnaround time from my perspective.
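For the curious, the three nines arithmetic is easy to sketch. The Python below assumes a 365-day year and counts only the roughly 3 hours of hands-on repair. The box was also dead overnight before I got to it, and I haven't counted that time here, so treat the result as a best case rather than a measured availability figure.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def availability(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Fraction of the period the system was up."""
    return 1.0 - downtime_hours / period_hours


def downtime_budget(target, period_hours=HOURS_PER_YEAR):
    """Hours of downtime allowed in the period for a target availability."""
    return (1.0 - target) * period_hours


if __name__ == "__main__":
    repair_hours = 3.0  # hands-on time only; the overnight outage is excluded
    print(f"three nines budget: {downtime_budget(0.999):.2f} hours per year")
    print(f"availability with one repair: {availability(repair_hours):.4%}")
```

Three nines allows a little under 9 hours of downtime per year, so the repair window alone fits inside that budget, but the budget disappears quickly once the hours the box sat dead before the repair are added in.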