A few days ago I began experiencing multiple error messages caused by a failing hard disk on my Windows Home Server (WHS). Dealing with a bad drive can be a pain – particularly with the downtime experienced – but it is for scenarios like this that I like to have duplication turned on for all my WHS folders …
The first sign of trouble was that my Media Center could not access any files on WHS. I checked Event Viewer and found lots and lots of disk errors like this:
The device, \Device\Harddisk2, has a bad block.
The Event ID was 7.
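If the log is full of these messages, it can help to tally them per device to see which disk is really the problem. Here is a minimal Python sketch, assuming you have saved the System log from Event Viewer as plain text (the function name and the sample text are my own illustration, not something WHS provides):

```python
import re
from collections import Counter

# Matches the message text quoted above, for example:
# The device, \Device\Harddisk2, has a bad block.
BAD_BLOCK = re.compile(r"The device, (\\Device\\Harddisk\d+), has a bad block")


def count_bad_blocks(log_text):
    """Return a Counter mapping device name to number of bad-block messages."""
    return Counter(BAD_BLOCK.findall(log_text))


if __name__ == "__main__":
    # Hypothetical sample standing in for an exported event log.
    sample = "\n".join([
        r"The device, \Device\Harddisk2, has a bad block.",
        r"The device, \Device\Harddisk2, has a bad block.",
        r"The device, \Device\Harddisk0, has a bad block.",
    ])
    for device, hits in count_bad_blocks(sample).most_common():
        print(device, hits)
```

A device that dominates the tally is the likely candidate for replacement, although a count like this is only a hint and not a diagnosis.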
At this point I was already anticipating that I would probably have to replace the drive. For good measure I ran chkdsk /r to check all my drives and rebooted.
Then I started to see tons of errors that looked like this:
File record segment 10001 is unreadable
File record segment 10002 is unreadable
Once these had finished, chkdsk started to repair the issues, but that process hung, so I had to give up on it.
At this point I turned my WHS off until the replacement drive arrived (once it did, I added the new drive to the storage pool without any issues).
As I mentioned earlier I have duplication turned on for all my folders on WHS – for me it is a small price to pay for being able to have WHS rebuild my data from the duplicate files and get me back to where I was without too much fuss. It does however take a while to do this.
With a new drive installed I set about trying to remove the bad drive from the storage pool. Event Viewer told me that it was Harddisk2 that was having issues, and thanks to my previous organization this was drive 2 in my tower, connected to SATA cable number 2 on my motherboard.
I was also pretty sure that Harddisk2 was the second disk listed in the Storage tab of the WHS console – but I was not 100% confident, as I had other drives with the same name listed in my storage pool too. So I downloaded and installed the free version of HDTune to double-check. Sure enough, the second drive in my HDTune list did not respond when I tried to select it. HDTune let me get the serial numbers for all the working drives, and by a process of elimination I used these to confirm which was the problem drive (I have the serial numbers written on the rear of each drive so that I can see them when I open the case).
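The process of elimination is simple enough to do by eye, but with a lot of drives it could be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the serial numbers are made-up placeholders, and it assumes you type in the serials from your drive labels and the ones your disk tool reports.

```python
def find_missing_drives(labelled, reported):
    """Return {position: serial} for labelled drives whose serial was not reported.

    labelled: dict mapping a drive position (e.g. "bay 2") to the serial
              written on the drive.
    reported: iterable of serials that the disk tool could read.
    """
    # Normalise case and stray spaces so hand-typed serials still match.
    seen = {serial.strip().upper() for serial in reported}
    return {
        position: serial
        for position, serial in labelled.items()
        if serial.strip().upper() not in seen
    }


if __name__ == "__main__":
    # Placeholder serials, not real ones.
    labelled = {"bay 1": "AAA111", "bay 2": "BBB222", "bay 3": "CCC333"}
    reported = ["aaa111", "CCC333 "]
    print(find_missing_drives(labelled, reported))
```

Any drive left over in the result is one that the tool could not read, which makes it the prime suspect.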
I hoped to be able to remove the problem drive with a few clicks in WHS, but I found that WHS could not remove the drive due to “file conflicts”. So I shut down and physically disconnected the drive.
With the drive disconnected I rebooted WHS and tried to remove the now missing drive from the pool. Again I got an error message about file conflicts. I had a look around and saw that WHS was calculating sizes in the Storage tab (which I figured was to be expected). However, when I clicked on the Network Critical button I found that I was getting an alert for each folder that contained files from the ‘missing’ drive that I had removed. I had to wait for WHS to work through all the files and folders that it expected to see on the missing drive before it would begin removing the missing drive from the storage pool.
Even then this process failed due to file conflicts. The culprits I found were my Media Center and the online backup software that I had installed on WHS. I shut these both down and rebooted WHS and finally the missing drive could be successfully removed.
The drive that I removed was a 2TB drive and it took a long time for WHS to repair itself. I probably had about 5 days of downtime in total which is far from great.
Having WHS repair itself from folder duplication saved me a lot of hassle, though, as there is nothing like trying to organize a couple of TB of files from a backup.
The only things that I lost were the backups of my Windows computers. I plan to install an add-in called Windows Home Server Backup Database-Backup (BDBB) so that I can back up my backups to a network share on another machine and/or enable duplication on my WHS.
For now I am just happy that WHS did its job. Folder duplication can be a life saver when a drive fails – but I still have a backup (offsite) of my most critical data.