George:

Wow, this is Bad News.

> I've got a situation, our primary server has lost 2 of
> its 4 scsi hard drives. We had a root partition
> running RAID5 across 4 drives and it appears from IBM's
> check disk utility that 2 of them are bad. We are
> doing software raid using Redhat 7.1 and the 2.2
> kernel. We had our /boot partition mirrored, our
> /root partition striped and our swap partition
> mirrored. We need to at least be able to retrieve the
> content to put on our backup server before Monday. At
> best to restore everything. We do have spare hds.

> Question 1: IS it possible to recover our content from
> such a failure?

Well, I'm not sure I understand the question. You say that the "root
partition" was RAID5 across 4 drives. You later say that the "/root
partition" was striped and the other partitions were mirrored.

Striping is RAID-0. Mirroring is RAID-1. RAID-5 is striping across
n+1 disks with n disks' worth of usable space; the remaining disk's
worth holds parity (spread across all the disks), so the array
survives one failed disk but not two.

So, which partition were you trying to recover? The / (root) partition
is the root filesystem, and it doesn't sound like you had it mirrored.
The /root directory (partition?) is generally just root's home
directory, and there usually isn't much of importance in there. You
should be able to just recreate the /root directory and call it good,
maybe put a few .bashrc scripts etc. in there.

> Question 2: What utilities are out there to aid in
> such a recovery.

If you are using software RAID then you pretty much have to use the
"ckraid --fix" utility that comes with the raidtools package. Red Hat
provides the raidtools package as well as a bunch of kernel patches to
make it work properly. Ckraid is very poorly documented and it may not
be able to recover your data, but it's your only option if the data
can be recovered at all.

You could post your /etc/raidtab file and perhaps the output of
"cat /proc/mdstat" from this system, if you can get the kernel and
the system running well enough to get that output.
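
If Python happens to be on the box, a few lines like these would
gather both files into one chunk you can paste into a mail. This is
only a sketch: collect_raid_info is a name I made up, and the two
paths are simply the ones mentioned above.

```python
from pathlib import Path

# The two files named in this message; pass other paths to try it elsewhere.
DEFAULT_PATHS = ("/etc/raidtab", "/proc/mdstat")


def collect_raid_info(paths=DEFAULT_PATHS):
    """Return each file's contents under a header line, ready to paste."""
    sections = []
    for name in paths:
        try:
            body = Path(name).read_text(errors="replace")
        except OSError as exc:
            # A missing or unreadable file is itself worth reporting.
            body = f"(could not read: {exc})\n"
        sections.append(f"==== {name} ====\n{body}")
    return "\n".join(sections)


if __name__ == "__main__":
    print(collect_raid_info())
```
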
That would help the mailing list members in diagnosing the problem.
I assume you don't have a tape backup of this system?

I've used Linux software RAID a bit and I've always wondered what
would happen if you had a RAID5 disk failure but didn't notice it for
a while, until a second disk failed and the whole thing was toast. It
seems like the only way to tell what's going on is to look at
/proc/mdstat from time to time and see whether all the disks show as
[UUUU] or not. I've never had a RAID5 disk failure yet, so I don't
know what it would say if one of the disks failed.

> Question 3: This is a time critical issue, if there is
> anyone out there who might be able to deliver
> assistance, or if there are companies out there who
> can handle recovery from this type of failure. Please
> respond ASAP. Please call George @ 303-596-0417.

I think your best bet is going to be to find a consultant or
consulting organization which has experience with Linux software
RAID. You could try TechAngle (www.techangle.com), the hosts of the
CLUE mailing lists (and a local ISP and consulting company), since
they do IT outsourcing and consulting work.

--
Jim Ockers (ockers@ockers.net)
Contact info: please see http://www.ockers.net/
Fight Spam! Join CAUCE (Coalition Against Unsolicited Commercial
Email) at http://www.cauce.org/ .
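
P.S. On checking /proc/mdstat from time to time: a rough sketch of how
that could be automated. It assumes each redundant array prints a
"[4/4] [UUUU]" style pair after its block count, which is all I have
seen on healthy arrays, and it simply flags any status that is not all
U's rather than guessing what a failed disk looks like. The function
name is hypothetical.

```python
import os
import re

# Assumed layout: "... [4/4] [UUUU]" on the line after "mdN : active ...".
ARRAY_RE = re.compile(r"^(md\d+)\s*:")
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([^\]]+)\]")


def arrays_not_all_up(mdstat_text):
    """Return {array_name: status} for arrays whose status is not all 'U'."""
    suspicious = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            flags = status.group(3)
            if set(flags) != {"U"}:
                suspicious[current] = flags
    return suspicious


if __name__ == "__main__":
    # Only meaningful on a machine that actually has software RAID.
    if os.path.exists("/proc/mdstat"):
        with open("/proc/mdstat") as handle:
            problems = arrays_not_all_up(handle.read())
        for name, flags in sorted(problems.items()):
            print(f"{name}: [{flags}]")
```

Run from cron and mail yourself whatever it prints; no output means
every array reported all U's.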