[CLUE-Tech] RAID-5

Adam Bultman adamb at glaven.org
Sat Oct 25 17:36:14 MDT 2003


> 
> My question is: what if I get a disk failure?  The excellent
> writeup on RAID-5 at http://www.acnc.com/04_01_05.html says
> "Disk failure has a medium impact on throughput."  Seems
> to me if a disk fails, it's all over.  It only uses parity,
> and parity can't tell you how to fix bad data, only that
> some bit is incorrect.

With a RAID 5, when you get a disk failure, your RAID keeps running fine,
although it is degraded in both speed and reliability.  The SCSI drives
themselves know they are having problems, and can report failures.  That
is the piece the parity argument misses: parity by itself can't tell you
which bit is wrong, but once the controller knows which drive is gone,
the parity plus the data on the surviving drives is enough to work out
what was on the dead one.  A RAID-5 spreads that parity across all of its
drives, so it can lose any one drive and still 'know' what the data is.
The card does the reconstruction on the fly, which is why you'll be
operating slower than normal.  I recently had a hard drive fail in a VA
Linux machine (last weekend, actually).  The SCSI card reported it as bad,
and it ran for most of the week with 3 of 4 drives - and ran well enough
that no one noticed any performance problems.  I ordered a drive (NOT at
my leisure - a RAID-5 missing two drives cannot reconstruct anything),
popped it in, and the RAID card noticed first that the dead drive was
removed, and then that a new drive was added.  It then proceeded to use
the information from the 3 good drives to reconstruct the data for the new
drive.  10 minutes later, I was back in business.
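
If it helps to see the parity trick in action, here is a rough sketch in
Python.  It is only an illustration, not what any particular card's
firmware does: the block contents and the names xor_blocks, data, parity
and rebuilt are made up, and a real RAID-5 rotates the parity block from
drive to drive instead of keeping it on one disk.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a 4-drive array: three data blocks (made-up contents)
# plus the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The drive holding data[1] dies.  The controller knows WHICH drive is
# gone, so it XORs everything that survived to get the lost block back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Here rebuilt comes out identical to the lost block, because XORing the
surviving data blocks out of the parity leaves only the missing one.  Lose
a second drive and there is no longer enough information to do this.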

> 
> So maybe I should use RAID-1?  That is with mirroring and
> duplexing.  But even there if I get a disk starting to fail,
> the controller won't know who to believe, just that the two
> disks disagree.
> 

Mirroring is good, too.  It requires fewer drives (two is enough), but at
a penalty of capacity: instead of n-1 drives' worth of space, you get half
of the total.  And if a drive DOES die, the controller will usually know
which one failed.  And again, the SCSI drives are smart enough to know
that they are having problems, too.  So the card will notice a problem,
identify it, and in all likelihood tell you what it is (e.g. ID 0
failure).  If you have a good enough RAID card and a drive fails, you can
put in a new drive, and it will recognize the new drive and rebuild the
mirror automagically.  Some of the servers I manage have mirrors in them.
It takes a bit longer to boot after a crash, but it's nice to know that
the data won't be lost unless my RAID card goes haywire and fries the
drives.
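
To put numbers on the capacity trade-off, here is a quick back-of-the-
envelope sketch in Python.  The function names and the 36 GB drive size
are just examples I picked, and it assumes all drives are the same size
and that the mirror is a plain two-way one.

```python
def raid5_usable_gb(drive_count, drive_size_gb):
    """RAID-5 gives up one drive's worth of space to parity: n-1 drives."""
    if drive_count < 3:
        raise ValueError("RAID-5 needs at least 3 drives")
    return (drive_count - 1) * drive_size_gb

def raid1_usable_gb(drive_count, drive_size_gb):
    """Two-way mirroring keeps a full copy of everything: half the total."""
    if drive_count < 2 or drive_count % 2 != 0:
        raise ValueError("two-way mirroring needs an even number of drives")
    return (drive_count // 2) * drive_size_gb

# Example: four 36 GB drives either way.
raid5_space = raid5_usable_gb(4, 36)   # 3 drives of data, 1 of parity
raid1_space = raid1_usable_gb(4, 36)   # 2 drives of data, 2 of copies
```

With four 36 GB drives that works out to 108 GB usable for RAID-5 against
72 GB for mirrored pairs.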


> Also, if the controller is fixing things on one disk, how can 
> I find out and perhaps replace the degrading disk?  Is there
> a raidtools that knows about hardware controllers?
> 

Again, if your controller is smart enough, it'll do it itself.  The money
is well spent on a RAID card smart enough to do that for you without your
intervention.  When it comes to RAID cards, you really DO get what you pay
for.  I'd strongly recommend against the cheaper RAID cards (Adaptec
2100S, DPT Decade, etc.) because you PAY for it when you lose a drive (I
spent 64 hours rebuilding a RAID with a 2100S a year or so ago, and that
was in firmware - it could not do the rebuild from within Linux OR
Windows).


HTH, 

Adam


