Sunday, December 2, 2007

RAID Backup

Working perfectly!

Usually, a person needs a backup when their disk drive fails. All disk drives fail sometime - there is no escape from that truth. But there are other reasons for keeping good backups:
  • Total disaster, such as a fire or flood that destroys the whole computer and all nearby backups.
  • Deliberate mischief, such as a virus that deletes important files.
  • Accidental deletion or modification of one or more files.
I'm sure there are more reasons, but if we cover these we'll probably have the rest covered.

Drive Failure:

Data loss from a disk drive failure can mostly be avoided by using two mirrored drives in a configuration known as RAID 1. RAID means Redundant Array of Independent Disks, and has several well-defined levels. RAID 1 is a simple configuration with two drives which always contain exactly the same information, hence the term "mirrored." If either drive fails, the other simply becomes the system's sole drive and takes over without a hitch. Since the probability of two drives failing at once is very small, RAID 1 pretty well covers that problem. The new computer here employs RAID 1.
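To put a rough number on "very small," here is a minimal sketch. It assumes the two drives fail independently and uses a purely hypothetical 3% annual failure rate; neither assumption comes from measurements on this system, and two drives from the same batch may well fail together more often than this suggests.

```python
HOURS_PER_YEAR = 365 * 24


def prob_fail_within(annual_failure_rate: float, window_hours: float) -> float:
    """Chance that one drive fails during a window, given its annual failure rate."""
    survive_year = 1.0 - annual_failure_rate
    return 1.0 - survive_year ** (window_hours / HOURS_PER_YEAR)


def prob_both_fail(annual_failure_rate: float, window_hours: float) -> float:
    """Chance that both mirrored drives fail in the same window (independence assumed)."""
    p = prob_fail_within(annual_failure_rate, window_hours)
    return p * p


if __name__ == "__main__":
    afr = 0.03  # hypothetical annual failure rate, not a measured value
    print(f"one drive, same day:   {prob_fail_within(afr, 24):.2e}")
    print(f"both drives, same day: {prob_both_fail(afr, 24):.2e}")
```

Under these assumptions the chance of losing both drives on the same day is several orders of magnitude smaller than the chance of losing one, which is the whole point of mirroring.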

Total Disaster:

If the building burns down or floods, the only solution is to have a separate backup stored offsite. This can be on the internet, in another building some distance away, or perhaps in a fire- and waterproof safe. At this office a flood is highly unlikely, so we store encrypted DVD backups of most user files in a fire-resistant safe in the basement, and we occasionally put a DVD in a safe deposit box at the bank. I have just set up an upload account and I may stop putting DVDs in the safe deposit box. We'll see.

Deliberate Mischief, or Accidental Deletion or Modification:

RAID disks don't help here, because the RAID disk controller keeps the two mirrored disks identical even when the files themselves are deleted or corrupted. This is where Windows System Restore can be very handy indeed. I have several times seen a serious problem solved by restoring a system to a previous date and time. System Restore works, though it has the disadvantage that the whole drive reverts to a selected time in the past, even if you only need to recover one file.

[Image: Intel Storage Console rebuilding a RAID volume]

But if System Restore isn't the solution, then backups are the answer. DVD and internet backups can be used to restore user data, but what about all of the rest of the system? I started a full backup once, but quit when the backup wizard pointed out that I would need 19 DVDs. Enter "RAID Backup" with a third identical disk drive. At some reasonable interval (every day, every week, every month) I can disconnect the power to one of the two mirrored disks and connect the third disk. The disconnected disk is instantly a complete backup of everything, and the newly-connected disk will soon be overwritten and re-mirrored to the remaining good disk in the RAID 1 pair. Voila - a complete backup for about five minutes of hands-on work and a one-time cost of about $80. The re-mirror itself takes about 2 hours and 15 minutes, but the system is usable, if slower, while that takes place. And the third disk, with no power, is safe from any mischief.
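As a back-of-the-envelope check on those figures, the sketch below estimates the average copy rate during a re-mirror and how much data 19 DVDs would hold. It assumes 320 GB drives (decimal gigabytes), that the rebuild copies the entire disk, and single-layer 4.7 GB DVDs with no compression. Those are my assumptions for illustration, not values reported by the Intel software or the backup wizard.

```python
def rebuild_rate_mb_per_s(disk_gb: float, hours: float, minutes: float) -> float:
    """Average copy rate in MB/s if the whole disk is copied in the given time."""
    seconds = hours * 3600 + minutes * 60
    return disk_gb * 1000 / seconds  # 1 GB = 1000 MB (decimal units)


def dvd_capacity_gb(count: int, gb_per_disc: float = 4.7) -> float:
    """Total capacity of a stack of DVDs, assuming single-layer discs."""
    return count * gb_per_disc


if __name__ == "__main__":
    rate = rebuild_rate_mb_per_s(disk_gb=320, hours=2, minutes=15)
    print(f"average re-mirror rate: {rate:.1f} MB/s")
    print(f"19 DVDs hold about {dvd_capacity_gb(19):.1f} GB")
```

With these inputs the re-mirror averages roughly 39.5 MB/s, and 19 DVDs come to about 89 GB, which gives a feel for why swapping a disk beats feeding discs to a burner.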

[Image: Intel Storage Console showing the RAID volume rebuilt]

It Works!:

I wasn't entirely sure that the Intel software would be totally cool with what I wanted to do, but I tried it last night and today. The system has three identical 320 GB Western Digital hard disks. Steps in the experiment:
  • Disk drives A and B were mirrored; drive C was powered up as a spare but had never been used.
  • I shut down the computer, disconnected power on B, and rebooted the computer. The BIOS complained that the RAID 1 pair was "degraded" and gave me a chance to deal with it in the BIOS, but I declined and let the bootup proceed.
  • The computer booted normally, and the Intel monitor software presented a pop-up balloon that said the RAID 1 disk was degraded but could be repaired.
  • I clicked on the balloon and followed the instructions to restore disk C to mirror the good disk in the RAID pair, disk A. Two and a quarter hours later, A & C were a mirrored RAID pair and B was a complete backup. Job done.
  • As an experiment, however, I shut down again and disconnected all EXCEPT disk B, then rebooted. Again the BIOS complained and the Intel monitor software did too, but the system functioned normally on just the "backup" disk. As far as I could tell, all files were accessible. The RAID software, apparently confused, also created a second RAID array at this point, consisting of disk B and a "missing" disk. Duh.
  • I rebooted with only A & C connected, and everything worked once again, no complaints.
  • Then I connected B as well, rebooted, and got some complaints about a degraded pair in the second RAID array (disk B), but the system ran normally and all files on all disks seemed to be accessible, including the files on disk B.
  • Finally, I disconnected disk C, leaving A & B connected, and rebooted once again. The BIOS and the Intel application software both complained about degraded RAID arrays. But the Intel software allowed me to delete the second RAID array, consisting of only disk B. That done, it allowed me to re-mirror B to the good disk in the original RAID pair, disk A, even though disk B contained lots of valid data. I was concerned that it might not let me destroy data, and I think there were at least four warnings that data would be destroyed on disk B if I proceeded, but it finally let me do it. Now disk C is again the full backup and the system is back to a RAID array of disks A & B.
From now on the procedure will be much simpler: Shut down, disconnect B or C (whichever was connected), reconnect the disk that was disconnected, reboot, and tell the Intel application to restore the RAID array. The biggest hassle is moving the computer to a position where I can open the side panel and disconnect / reconnect drives. I can handle it.
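The rotation is easy to lose track of after a few swaps, so here is a minimal bookkeeping sketch of it. It does not talk to the Intel software or the hardware at all; the class and field names are hypothetical, and it simply assumes disk A always stays connected while B and C take turns as the shelf backup.

```python
from dataclasses import dataclass


@dataclass
class MirrorRotation:
    fixed: str    # disk that always stays connected (disk A in this post)
    partner: str  # disk currently mirrored with the fixed disk
    shelf: str    # unpowered disk holding the most recent full backup

    def rotate(self) -> str:
        """Swap the mirrored partner with the shelf disk; return the new backup disk.

        The disk coming off the shelf is overwritten by the re-mirror, so the
        previous backup is gone once the rebuild starts.
        """
        self.partner, self.shelf = self.shelf, self.partner
        return self.shelf


if __name__ == "__main__":
    disks = MirrorRotation(fixed="A", partner="B", shelf="C")
    for n in range(1, 4):
        backup = disks.rotate()
        print(f"rotation {n}: mirror = {disks.fixed}+{disks.partner}, shelf backup = {backup}")
```

One thing the model makes obvious: between the swap and the end of the re-mirror, the only complete copies are the fixed disk and the one just removed, so the disk that came out should stay untouched until the rebuild finishes.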

Windows Experience Index:

Before these little experiments, the system's Windows Experience Index was 5.4, limited by the disk subscore of 5.4. I ran the tests several times. Since the experiments, the Windows Experience Index is 5.5, limited by both the processor and gaming graphics, with the disk subscore improving to 5.7. Why did the disk subscore go up from 5.4 to 5.7, using exactly the same disks? Only Microsoft knows.
