[clue-tech] intentionally induce ext3 filesystem errors for testing recovery procedure?

Tue Apr 20 23:49:09 MDT 2010

Hi CLUEbies,

I'm working on some changes to rc.sysinit for an industrial computer 
system based on CentOS 5.2 or 5.3.  As shipped with CentOS, rc.sysinit 
checks the filesystems at startup, and fsck exits with return code 
0 (OK), 1 (fixed), 2 (fixed, reboot needed), 3 (both), or 4 or higher 
(didn't fix, or fsck itself failed).
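
For reference, the fsck exit status is a sum of bit flags, as documented 
in fsck(8).  Here is a rough sketch of a decoder; the function name 
describe_fsck_rc and the label strings are made up for illustration and 
are not part of the stock rc.sysinit.

```shell
# Hypothetical helper: turn an fsck exit status into readable labels.
# Bit meanings follow the fsck(8) man page; the labels are invented here.
describe_fsck_rc() {
    local rc=$1
    local out=""
    if [ "$rc" -eq 0 ]; then
        echo "clean"
        return 0
    fi
    if [ $((rc & 1)) -ne 0 ];   then out="$out errors-corrected"; fi
    if [ $((rc & 2)) -ne 0 ];   then out="$out reboot-needed"; fi
    if [ $((rc & 4)) -ne 0 ];   then out="$out errors-uncorrected"; fi
    if [ $((rc & 8)) -ne 0 ];   then out="$out operational-error"; fi
    if [ $((rc & 16)) -ne 0 ];  then out="$out usage-error"; fi
    if [ $((rc & 32)) -ne 0 ];  then out="$out cancelled"; fi
    if [ $((rc & 128)) -ne 0 ]; then out="$out shared-library-error"; fi
    echo "${out# }"   # strip the leading space
}
```

So an exit status of 3 is just 1 and 2 together, and anything from 4 up 
means fsck did not leave a known-good filesystem behind.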

I've already made most of the filesystems read-only using the readonly 
root stuff and /etc/rwtab.  Almost all of the read/write stuff (all of 
it is logfiles) is on a separate partition and not on the root 
filesystem.  I'm changing rc.sysinit around so that it will try more 
aggressively to fix the filesystems if there is some sort of error.  
Since the filesystems are mostly readonly I don't expect any errors on 
those, but the logfile partition could be badly corrupted.

In that case, the rc.sysinit is going to reformat the logfile partition 
and then reboot.  Hopefully on reboot the fsck will pass the logfile 
partition with an exit code less than 4.
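
Roughly, the decision I have in mind looks like the sketch below.  The 
function name log_fsck_action and the LOGDEV placeholder are made up for 
this example, and the commented-out call site is only an outline of how 
it might be wired in, not the actual rc.sysinit change.

```shell
# Hypothetical helper: map the fsck exit status of the logfile partition
# to one of three actions.
log_fsck_action() {
    local rc=$1
    if [ "$rc" -ge 4 ]; then
        # Errors left uncorrected.  Note this also catches 8 and up
        # (operational or usage errors), which may not mean corruption.
        echo reformat
    elif [ $((rc & 2)) -ne 0 ]; then
        echo reboot
    else
        echo continue
    fi
}

# Outline of a call site (LOGDEV is a placeholder, left commented out
# because the commands are destructive):
#
#   fsck -T -a "$LOGDEV"; rc=$?
#   case "$(log_fsck_action "$rc")" in
#       reformat) mke2fs -j -q "$LOGDEV" && reboot -f ;;
#       reboot)   reboot -f ;;
#   esac
```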

I think I've got most of the logic figured out for how this needs to 
work.  But I need to induce moderate filesystem corruption so I can test 
all of the cases.  A Google search made me think nobody has ever wanted 
to intentionally corrupt a filesystem.  Do any of you have any 
suggestions for what I could do?  Obviously I can dd garbage over the 
superblocks, but I think that will just make fsck exit 4.
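
One approach that might work for milder damage is to scribble over a 
small region of a throwaway image file instead of the superblocks, then 
see what e2fsck makes of it.  This is only a sketch: the scribble helper, 
the image path, and the offsets are invented for illustration, and I 
have not established which offsets produce which exit codes.

```shell
# Hypothetical helper: overwrite COUNT 1 KiB blocks of FILE, starting at
# block OFFSET, with random data.  conv=notrunc keeps the file size intact.
# Usage: scribble FILE OFFSET COUNT
scribble() {
    dd if=/dev/urandom of="$1" bs=1024 seek="$2" count="$3" \
       conv=notrunc 2>/dev/null
}

# Example session on a scratch image (never point this at a real disk):
#
#   dd if=/dev/zero of=/tmp/scratch.img bs=1M count=64
#   mke2fs -q -F -j /tmp/scratch.img       # -F: target is a plain file
#   scribble /tmp/scratch.img 300 16       # offset is arbitrary; vary it
#   e2fsck -f -y /tmp/scratch.img; echo "e2fsck exit status: $?"
```

Varying the offset and count should hit different metadata, so it may 
take some trial and error to land on damage that fsck can repair versus 
damage it cannot.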

Probably I will just wind up setting the variable in the script and not 
actually corrupting the filesystem. :(  But I was hoping to do 
end-to-end testing under real conditions...
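
The fallback would be something like the wrapper below, which skips the 
real fsck and returns a forced status when a test variable is set.  Both 
run_fsck and FAKE_FSCK_RC are names invented for this sketch.

```shell
# Hypothetical wrapper: FAKE_FSCK_RC is an invented test variable, not
# something CentOS or rc.sysinit defines.
run_fsck() {
    if [ -n "${FAKE_FSCK_RC:-}" ]; then
        # Pretend fsck ran and exited with the forced status.
        return "$FAKE_FSCK_RC"
    fi
    fsck "$@"
}
```

With FAKE_FSCK_RC=4 set, the rest of the script sees the same status it 
would get from a filesystem fsck could not repair.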

Thanks,
Jim

-- 
Jim Ockers, P.Eng. (ockers at ockers.net)
Contact info: http://www.ockers.ca/pason.html
