[CLUE-Tech] RAID-5

Sun Oct 26 05:50:40 MST 2003

On Sat, 25 Oct 2003 19:36:14 -0400 (EDT)
Adam Bultman <adamb at glaven.org> wrote:

> With a RAID 5, when you get a disk failure, your RAID runs fine - although
> it is degraded in both speed and reliability.  The SCSI drives themselves
> know they are having problems, and can report failures.  

How do they do that?  Does something go in /var/log/messages?

> Since you have a
> RAID-5, each drive contains enough parity information to lose one drive
> and still 'know' what the data is, using its brothers - it uses the rest
> of the data to reconstruct the missing information.  The card will do it
> on the fly, which is why you'll be operating slower than normal.  

I'm hung up on the math of this.  I have three 9G drives, one on each
physical channel.  If they were all data there would be 27G.  `dmesg` shows:

Oct 25 17:13:40 icicle kernel: scsi0: scanning virtual channel 0 for logical drives.
Oct 25 17:13:40 icicle kernel:   Vendor: MegaRAID  Model: LD0 RAID5 17364R  Rev:   A
Oct 25 17:13:40 icicle kernel:   Type:   Direct-Access                      ANSI SCSI revision: 02
Oct 25 17:13:40 icicle kernel: scsi0: scanning physical channel 0 for devices.
Oct 25 17:13:40 icicle kernel: scsi0: scanning physical channel 1 for devices.
Oct 25 17:13:40 icicle kernel: scsi0: scanning physical channel 2 for devices.
Oct 25 17:13:40 icicle kernel: Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Oct 25 17:13:40 icicle kernel: SCSI device sda: 35561472 512-byte hdwr sectors (18207 MB)

And with the filesystem:
/dev/sda1             17646824   3098096  14548728  18% /
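
As a sanity check on the numbers above, the sector count the kernel reports
can be converted to bytes directly.  This is only a sketch: the sector count
and the 512-byte sector size are copied from the sda log line, and reading
the "17364" in the model string as a size in binary megabytes is a guess
based on the arithmetic, not something taken from the controller's
documentation.

```python
# Values copied from the kernel log line for sda.
SECTORS = 35561472
SECTOR_BYTES = 512

total_bytes = SECTORS * SECTOR_BYTES

# Decimal megabytes (10**6 bytes), the unit the kernel line appears to use.
decimal_mb = total_bytes // 10**6

# Binary megabytes (2**20 bytes), for comparison with the model string.
binary_mb = total_bytes // 2**20

print(total_bytes)  # 18207473664
print(decimal_mb)   # 18207
print(binary_mb)    # 17364
```

Either way the logical drive is about two 9G drives' worth of space, not
three.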

So two-thirds of the total disk capacity (18G of 27G) holds data and the
other third is being used for redundancy.  It seems that the one-third
being used for redundancy can correct only an equal amount of bad data on
the other two-thirds of the capacity, or about half of the 18G.
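
One way to look at the math: RAID 5 parity is commonly computed as the
bytewise XOR of the data blocks in a stripe, and the promise is only that
one known-missing drive can be tolerated at a time.  Under that assumption
the missing block is the XOR of the survivors, whichever drive it was.  The
sketch below is a toy model with made-up block contents and a hypothetical
helper name (xor_blocks); it is not the MegaRAID firmware's actual layout.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across three drives: two data blocks plus one parity block.
data_a = b"RAID"
data_b = b"five"
parity = xor_blocks(data_a, data_b)
stripe = [data_a, data_b, parity]

# Lose any ONE drive; the XOR of the two survivors gives back the lost block.
for lost in range(3):
    survivors = [blk for i, blk in enumerate(stripe) if i != lost]
    rebuilt = xor_blocks(*survivors)
    assert rebuilt == stripe[lost]
    print("drive", lost, "rebuilt as", rebuilt)
```

The catch is that this only works when the controller knows which drive is
missing.  If all three drives answer but the stripe no longer XORs to zero,
parity alone shows that something is wrong without saying which drive is
the bad one.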

> I recently had a hard drive fail in a VA Linux machine (last weekend,
> actually.) The SCSI card reported it as bad, and it ran for most of the
> week with 3 of 4 drives - and ran well enough so that no one noticed any
> performance problems.  I ordered a drive (NOT at my leisure - a RAID
> missing two drives cannot reconstruct information), popped it in, and the
> RAID card noticed first that the dead drive was removed, and then that a
> new drive was added.  It then proceeded to use the information from the 3
> good drives to reconstruct the data for the new drive.  10 minutes later,
> I'm back in business.  

I tried something similar. I shut down the system, took out one of the
three drives and put an identical one in and rebooted.  It noticed that
the drive was changed when I booted, but it did not recover.  It gave me
two choices: proceed or go into diagnostic mode.  First I did diagnostic
mode and found nothing useful.  I rebooted and selected 'proceed' and it
could not boot.  It just sat quietly, with no apparent disk activity
for five minutes or so.

> > So maybe I should use RAID-1?  That is with mirroring and
> > duplexing.  But even there if I get a disk starting to fail,
> > the controller won't know who to believe, just that the two
> > disks disagree.
> > 
> 
> Mirroring is good, too. It requires fewer drives, but at a penalty of 
> capacity. Instead of n-1 space, you have 1/2 the space there.  And if a 
> drive DOES die, the controller will know which one failed - usually 
> controllers can.  And again, the SCSI drives are smart enough to know that 
> they are having problems, too.  So the card will notice a problem, 
> identify it, and in all likelihood, tell you the problem (e.g. ID 0
> failure).  If you have a good enough RAID card, if you have a failed 
> drive, you can put in a new drive, and it will recognize the new drive and 
> rebuild mirror information automagically.  Some of the servers I manage 
> have mirrors in them.  Takes a bit longer to boot after a crash, but it's 
> nice to know that the data won't be lost unless my RAID card goes haywire 
> and fries the drive.

Maybe buried in there is the key: "the SCSI drives are smart enough to
know they are having problems."  If I have a mirror (RAID 1) I can see that
there is another copy of every data byte and the 50% capacity makes sense.
The controller I have allows RAID 0, 1 and 5 and it seems if you have three
drives you would always use RAID 5; two drives would use RAID 1.

Still, I can see how two drives keep 1/2 of the total capacity backed up,
but three drives keeping 2/3 of it backed up seems like something for
nothing.
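
The usual rule of thumb is that a two-drive mirror leaves 1/2 of the raw
capacity usable and RAID 5 across n equal drives leaves (n-1)/n, because in
both cases the guarantee is surviving one drive failure, not any amount of
damage.  A small sketch of that arithmetic, using hypothetical function
names and the nominal 9G drive size mentioned earlier:

```python
from fractions import Fraction

def raid1_usable(drive_gb):
    """Two-drive mirror: one drive's worth of data, one drive's worth of copy."""
    return drive_gb

def raid5_usable(n_drives, drive_gb):
    """RAID 5: one drive's worth of capacity goes to parity."""
    if n_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (n_drives - 1) * drive_gb

print("mirror:", raid1_usable(9), "G usable of", 2 * 9)
for n in (3, 4, 5):
    usable = raid5_usable(n, 9)
    print(n, "drives:", usable, "G usable,", Fraction(usable, n * 9), "of raw capacity")
```

Both layouts tolerate exactly one failed drive.  RAID 5 spreads that single
drive's worth of redundancy over more drives, so the fraction given up
shrinks as drives are added, but so does the margin: a second failure
before the rebuild finishes loses the array.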

> > Also, if the controller is fixing things on one disk, how can 
> > I find out and perhaps replace the degrading disk?  Is there
> > a raidtools that knows about hardware controllers?
> 
> Again, if your controller is smart enough, it'll do it itself.  The money 
> is on having a smart enough RAID card to do that for you without your 
> intervention.  

But if a drive is getting flakey, I'd sure like to know.  I don't want it
to be completely silent about it.  I'd rather replace a drive that's giving
a few errors a day than wait until it's completely gone.

> When it comes to RAID cards, you really DO get what you pay 
> for.  I'd strongly recommend against the cheaper RAID cards (Adaptec
> 2100s, DPT Decade, etc.) because you PAY for it when you lose a drive (I
> spent 64 hours rebuilding a RAID with a 2100s a year or so ago, and that
> was in firmware - it could not do it in Linux OR Windows).

I'd sure like to watch (demo at school) this automagic repair in action.
Should I be able to power down cleanly, remove a drive, put in a physically
identical drive and have it recreate the drive I removed?

Adam, thanks for your feedback on this.

-- 
Roger Frank                                        rfrank at rfrank.net   
http://www.rfrank.net        Ponderosa High School, Parker, Colorado