[clue-talk] BAARF - Battle Against Any Raid Five (er, Four, er Free.)

Nate Duehr nate at natetech.com
Sun Oct 21 12:10:17 MDT 2007


On Oct 21, 2007, at 7:55 AM, Angelo Bertolli wrote:

> On Fri, October 19, 2007 10:50 pm, Jed S. Baer wrote:
>> On Fri, 19 Oct 2007 19:44:28 -0600
>> Nate Duehr wrote:
>>
>>> Good technical detail about why RAID-5 isn't always (in fact rarely)
>>> the correct technical solution for disk redundancy...
>>>
>>> http://www.miracleas.com/BAARF/BAARF2.html
>>
>> I remember reading Cary Millsap's articles way back when. But I
>> have to wonder, if RAID5 is so bad, why is it still so popular?
>
>
> Because... it works.  I've started reading the site, and maybe it's
> supposed to be satirical.

I don't think it is.  The "executive summary" of the articles as a
whole seems to be (to me, anyway): "Instead of RAID 5, use RAID 0
(striping) to get the size you need, then lay RAID 1 over the top and
mirror that stripe to another drive or set of drives."  (They also
talk about going the opposite direction: RAID 1 mirrors with RAID 0
striping laid over the top.)

One of the articles has some interesting failure analysis in it
(RAID 5 is three times more likely to fail than RAID 0+1/1+0), but he
doesn't cite his source.  I've seen other work done on this, and the
math actually works out that way... he's "right," but the site does a
poor job of documenting/proving it.
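
If anyone wants to sanity-check that kind of claim, here's a toy
model I threw together -- mine, not the articles', and it ignores
rebuild windows entirely.  It just asks: in a 6-disk array with one
disk already dead, how many of the possible two-disk failure
combinations actually lose data?

# Toy model: which two-disk failure combos lose data?  6-disk array.
# The RAID 1+0 pairing below is an assumption for illustration.
from itertools import combinations

DISKS = 6
MIRROR_PAIRS = [(0, 1), (2, 3), (4, 5)]   # RAID 1+0: 3 pairs, striped

def raid5_dead(failed):
    return len(failed) > 1                # any second failure is fatal

def raid10_dead(failed):
    # Fatal only if both halves of some mirror pair are gone.
    return any(a in failed and b in failed for a, b in MIRROR_PAIRS)

combos = list(combinations(range(DISKS), 2))
print("2-disk combos that kill RAID 5:   %d of %d"
      % (sum(raid5_dead(set(c)) for c in combos), len(combos)))
print("2-disk combos that kill RAID 1+0: %d of %d"
      % (sum(raid10_dead(set(c)) for c in combos), len(combos)))

With 6 disks, all 15 two-disk combinations kill the RAID 5, but only
3 of 15 kill the RAID 1+0.  The real-world ratio depends on array
size and rebuild time, which this ignores, and that's presumably
where their "three times" figure comes from.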

> Actually we use RAID6 where I work when we can, and RAID5 when we
> cannot.  Although RAID6 isn't one of the original standards, it's
> become a de facto standard, meaning that you essentially have double
> parity using Reed-Solomon codes.

Yeah, that "standard" is becoming more popular, and it avoids the
"losing more than one disk kills the entire RAID 5 array" argument.
It doesn't address the huge performance hit taken on writes, however.
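
For anyone who hasn't run into the write-penalty math: a small random
write on RAID 5 turns into four disk I/Os (read old data, read old
parity, write both back), and RAID 6 makes it six.  Quick
back-of-the-envelope -- the per-disk IOPS figure is just a number I
made up for illustration:

# Rough small-random-write throughput for a 6-disk array.
# Write penalties are the textbook ones; 150 IOPS/disk is made up.
DISKS = 6
IOPS_PER_DISK = 150

WRITE_PENALTY = {
    "RAID 0":  1,   # just write the data
    "RAID 10": 2,   # write the data plus its mirror copy
    "RAID 5":  4,   # read old data + old parity, write both back
    "RAID 6":  6,   # same, plus the second parity block
}

for level in ("RAID 0", "RAID 10", "RAID 5", "RAID 6"):
    iops = DISKS * IOPS_PER_DISK / float(WRITE_PENALTY[level])
    print("%-7s ~%4d small random writes/sec" % (level, iops))

A big battery-backed write cache hides a lot of that in practice,
which ties into the controller-cache point further down.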

> Now, keep in mind when you're reading these arguments that you're
> not SUPPOSED to take all of your drives and make one huge RAID5.
> What's the point of only being able to lose ONE drive?  But like I
> said: it works because people determine how many drives can fail at
> one time and build their RAID5s accordingly.  It's really
> unnecessary to make everything a RAID1: one out of two drives does
> NOT fail before it can be replaced.

I've seen RAID 5s fail because more than one disk died, but those
were systems that weren't being monitored correctly, or at all.

I have no worries there: people using RAID 5 for production systems
($$$) either *are* monitoring their physical disk states correctly,
or they WILL be after they screw up once.  (Pain is a great
motivator.)

The more interesting topic embedded in those articles is that many of
them were written years ago, when RAID 5 first got "popular," and
they hint at an interesting dynamic that played out.  Instead of
engineers THINKING about whether RAID 5 was the "correct" solution
where write speed was concerned, people just plowed ahead and
effectively forced the manufacturers of all modern disk sub-systems
to add fast cache (memory) to their controllers to make up for the
hit being taken on write performance under RAID 5.

One could argue that now that hardware-based systems all typically
have that feature/functionality/"solution," the argument the site
makes is somewhat moot.  But...

Many Linux admins are doing all of this via md (software) and not in  
hardware.  So it probably behooves us to seriously look at whether  
the performance hit on writes is worth the RAID 5 "goodness" or if a  
RAID 10 setup would both perform better and have a slightly lower  
real-world failure rate.
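
The other side of that tradeoff is capacity, which is the reason most
people reach for RAID 5 in the first place.  Rough numbers -- the
disk size is just an example, hot spares and metadata ignored:

# Usable space from 6 x 500 GB disks, standard formulas only.
N, SIZE_GB = 6, 500

print("RAID 5 : %d GB (one disk of parity)"   % ((N - 1) * SIZE_GB))
print("RAID 6 : %d GB (two disks of parity)"  % ((N - 2) * SIZE_GB))
print("RAID 10: %d GB (half goes to mirrors)" % (N // 2 * SIZE_GB))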

(Honestly I have no opinion -- just stating some ideas for  
discussion.  Most of the systems I work on are in a RAID 0+1 or 1+0  
type of configuration these days, though -- just as an anecdote.  Or  
they're using a SAN where all the disk management is "offloaded" to  
something more "intelligent" than just a RAID setup... hot-spare  
disks automatically used, etc.  I'm interested in what others are  
doing, just out of curiosity.)

> Also, when you get into RAID hardware, you can set up a drive to be
> a global spare.  Therefore, even with RAID5, you can be sure that if
> no one is there to replace a drive you'll have some time before you
> have to get another one in there.

You can set up global spares on MOST RAID systems.  Older stuff  
didn't have it, sadly.  It still doesn't alleviate the need to  
properly monitor the hardware, though.

(And you still have to monitor the hardware even in RAID 0+1/1+0  
setups too, no doubt about that.)
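
For the md folks, even something as dumb as a cron'd check of
/proc/mdstat beats finding out about a dead disk only when a second
one goes.  A quick-and-dirty sketch -- mdadm's --monitor mode does
this properly (and can e-mail you), so treat this as illustration
only:

#!/usr/bin/env python
# Flag degraded md arrays by scanning /proc/mdstat.  The status
# brackets look like [UU] when healthy and [U_] when a member is
# missing or failed; an underscore means trouble.
import sys

def degraded_arrays(path="/proc/mdstat"):
    bad = []
    current = None
    for line in open(path):
        if line.startswith("md"):
            current = line.split()[0]       # array name, e.g. "md0"
        elif current and "[" in line and "_" in line.split("[")[-1]:
            bad.append(current)
    return bad

if __name__ == "__main__":
    bad = degraded_arrays()
    if bad:
        print("DEGRADED: " + ", ".join(bad))
        sys.exit(1)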

> We also use RAID1, but only on individual machines where we don't
> need huge amounts of storage.  The articles on the site seem to make
> a case against RAID5 for Oracle databases.  There may be a case for
> that, but the idea that everyone should just be using RAID1 instead
> of RAID5 is pretty silly.

I'm not sure that technically it's "silly" -- they just don't state
their case very clearly or concisely, with citations for the math
involved in the failure-risk analysis.  I know I've seen that math
somewhere, but I can't find it right at the moment... the bookmarks
file is disorganized and out of control again (heh) and the GoogleFu
is lacking today.  (GRIN)

Basically they're saying that a well-designed multiple-disk RAID
0+1/1+0 system will smoke a RAID 5 on write speed in most cases, and
will also be less likely to have a failure take the system down.  I'm
not sure I disagree, but with the MTBF of disks going up quite a bit
(well, it seems like they have, anyway, but I can't cite that either
-- just personal "evidence"), if either setup is monitored correctly
a fix can almost always be deployed before anyone who uses the system
cares or notices.

The really interesting analysis lies in the performance hit... I
think, anyway.  Having used both kinds of systems in commercial
environments, though, I'd say even the performance hit is somewhat
moot -- companies never ask the sysadmin/engineer to redesign the
system layout to "fix" performance problems anymore.  If the system
is making $, they just immediately start "throwing money at the
problem" and buying bigger/faster disk sub-systems before anyone has
time to do the analysis.

Only us "margin" users of RAID on our home/small-business Linux  
systems (especially those of us who don't have big company budgets)  
might gain some "performance-Fu" from thinking a bit about this.  And  
we're all pretty likely to ignore it until the next time we build/ 
rebuild a system, anyway... since reality means that other things are  
usually more important to do... (GRIN).

I'm sure someone will feel strongly about this, now that I've thrown  
out all this drek on the topic.

Anyone?  Bueller?

:-)

--
Nate Duehr
nate at natetech.com





