[clue-tech] "rewindable" drive?

Nate Duehr nate at natetech.com
Thu Nov 20 14:02:09 MST 2008


David L. Willson wrote:
> Yes.  I should have said so to begin with.  The app is inherited, large, and complex.  We (the new dev team, not me personally) have looked it over, edited and blessed it, but we (me personally) are still full of rage, vitriol, and distrust, and rage and vitriol.  So we (me personally again) want an "insurance policy" for the unlikely case where the old and new devil-ope-ers missed something and some person-of-ethical-flexibility takes advantage thereof.
> 
> I want to be able to take an image, roll the system drive back to its uncracked state, and carefully re-import the latest data from the moment when we shut the server down.

Does the app have a database?

How much disk space can be spared for keeping old information that's not 
online?

Even if one exists (a filesystem that handles every write like a 
database transaction), the overhead is really high to do that with 
everything written to the filesystem... can the box(es) handle it?

I've been holding back on thoughts where this problem sounds similar to 
others I've worked on, because it was going to take a while to type up.

A hardware or software "snapshot" solution would work, but there's added 
complexity if there's a DB involved.  (You have to quiesce the DB before 
the snapshot so the on-disk files are consistent.  Some DBs have commands 
to do this, Oracle for one; some don't.)
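
To make the ordering concrete, here's a minimal Python sketch of 
"quiesce, snapshot, resume".  It's an illustration, not a drop-in tool: 
the three callables are placeholders you'd supply for your own DB and 
storage, and the volume and snapshot names in the usage note are made 
up.  The only real command referenced is LVM's lvcreate, and the sketch 
just builds its argument list rather than running it.

```python
from typing import Callable, List


def lvm_snapshot_command(origin: str, name: str, size: str) -> List[str]:
    """Build (but do not run) an lvcreate command for a copy-on-write snapshot.

    origin, name and size are whatever fits your volume group; the values
    are not checked here.
    """
    return ["lvcreate", "--snapshot", "--size", size, "--name", name, origin]


def snapshot_with_quiesce(
    quiesce: Callable[[], None],
    take_snapshot: Callable[[], None],
    resume: Callable[[], None],
) -> None:
    """Quiet the DB, take the snapshot, and always resume the DB afterwards.

    The callables are hypothetical hooks: quiesce/resume would wrap whatever
    your RDBMS offers (or stop/start the service if it offers nothing).
    """
    quiesce()
    try:
        take_snapshot()
    finally:
        # Resume even if the snapshot failed, so the DB is not left frozen.
        resume()
```

In practice take_snapshot could be something like 
`lambda: subprocess.run(lvm_snapshot_command("/dev/vg0/data", "pre_change", "5G"), check=True)`, 
with those names swapped for your own.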

But you have to answer the question of how often you need a snapshot 
and/or backup.  (How tragic is it if you lose an hour of data?  A day?  
A week?)
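
The trade-off between snapshot frequency and disk space is simple 
arithmetic, sketched below in Python.  The numbers and function names are 
mine, not measurements from any real system, and the space estimate 
assumes each retained snapshot holds roughly one interval's worth of 
changed data.  Real copy-on-write implementations differ, so treat it as 
a rough planning figure only.

```python
def worst_case_loss_hours(snapshot_interval_hours: float) -> float:
    """If the box dies just before the next snapshot, you lose one interval."""
    return snapshot_interval_hours


def snapshot_space_gb(
    change_rate_gb_per_hour: float,
    snapshot_interval_hours: float,
    snapshots_kept: int,
) -> float:
    """Rough space needed, assuming one interval of changes per snapshot."""
    return change_rate_gb_per_hour * snapshot_interval_hours * snapshots_kept
```

For example, hourly snapshots on a box that changes about 2 GB an hour, 
keeping a day's worth (24), works out to roughly 48 GB of snapshot space 
for a worst-case loss of one hour.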

There are already good daily backups, right?  And you know how to recover 
from them and how long it takes, right?

It sounds like you're looking for something FASTER than that, and are 
willing to pay for the convenience versus paying someone to be on-call 
and just fire up the recovery process from the normal backups...?

If not... then just double-check and focus on the existing (or hopefully 
not NON-existing) backup scheme for the overall system.

So... to answer your original question:

Veritas FS can kinda do what you want, but not really.  Same with 
ZFS and some others, but nothing other than perhaps the previously 
mentioned Oracle product is really designed for "filesystem rollback", 
that I've seen.

Mostly this is because most applications just don't LIKE being rolled 
back like that, so it's rare to see the FS being the "location" in the 
system design block-diagram where a live roll-back system is implemented.

Filesystems are usually just backed up, and rollback/roll-forward type 
functionality is usually confined to whatever RDBMS is in use.  
Filesystem snapshots, either in custom hardware or via things like LVM, 
are "nice" for quicker recovery if the disks have the space, but they're 
not a backup plan... they're just part of the contingency plan to make 
recovery faster... not usually the master backups.

(One shop I worked in snap-shotted the main SAN which was a NetApp Filer 
at various times throughout the day, and ran incremental tape backup of 
those snapshots nightly, and a full backup of the entire latest NetApp 
snapshot to a very fast tape carousel every Friday night that took much 
of the weekend to complete.  Nothing could capture "real-time" rollback 
with as much data as that SAN had in it.  It was too big.)

When you start hitting problems where the filesystem containing the 
application is being attacked, that's way out of control, security-wise, 
and the efforts to limit that exposure need to be much more the focus.

Even mounting the applications themselves read-only from media that 
can't be written to can be one way to handle it, if you can't keep the 
crackers out of the machine... put the app on a CD-ROM even.  (LOL... 
sounds crazy, but it'd work... of course the crackers can always mess 
with mount points if they're getting ROOT access.  If you're concerned 
about THAT happening, the whole thing is GAME OVER.  You can't protect 
that machine or its users in any reasonable sense of the term "protect", 
and shouldn't bother trying.  A security re-design is the only way out 
of hell, in that extreme circumstance.)
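
If you do go the read-only route, it's worth checking that the mount is 
still read-only, since a remount is the obvious way around it.  Here's a 
small Python sketch that parses text in the Linux /proc/mounts format and 
reports whether a mount point carries the "ro" option.  The function name 
and the /opt/app path are made up for illustration, and a check like this 
only detects drift.  It does nothing against someone who already has 
root.

```python
def is_mounted_read_only(mounts_text: str, mount_point: str) -> bool:
    """Return True if the last entry for mount_point lists the 'ro' option.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if fields[1] == mount_point:
            # Later entries shadow earlier ones, so keep the last match.
            result = "ro" in fields[3].split(",")
    if result is None:
        raise LookupError(f"{mount_point} is not mounted")
    return result
```

On a Linux box you'd feed it `open("/proc/mounts").read()` and the mount 
point of the application, e.g. "/opt/app", from a cron job or your 
monitoring system.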

Generally... my head sees something is "wrong" with your request in a 
normal environment.  Your reply seems to indicate that there's a "new" 
and "old" admin team, and that leads me to think there's some nasty 
political issues going on somewhere...?

Perhaps even all-out "warfare" between some disgruntled folks and the 
new group, or maybe just the serious threat that that type of thing 
might happen?

If that's the case... fix social problems with social solutions, not 
tech.  Make sure the appropriate company lawyers on retainer are up to 
speed and on-call.

At some point, you do have to be ready to sue the bejeezus out of the 
"old" team if they mess with the business.

Threatening their lives/livelihood with the embarrassment of being 
arrested might be FAR more effective than worrying too much about the 
systems... they won't mess with the systems if they KNOW you're prepared 
to send them to jail... been there, done that.

I consulted once for a business owner who fired a sysadmin; the sysadmin 
retaliated, destroyed data, and locked the owner out of his systems.

Talking to the lawyers and law enforcement about the options was very 
interesting.  Eventually it blew over, but... being prepared to provide 
forensic evidence with a clearly non-tamperable data chain was quite a 
challenge.
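
One common building block for tamper-evident records is a hash chain, 
where each entry's digest covers the previous digest as well as the 
entry itself.  The Python sketch below is a generic illustration of that 
idea, not what was used in the case above, and it is not a legal chain 
of custody on its own.  The function names are mine.  Someone who can 
rewrite the whole chain can also recompute it, so the latest digest has 
to be kept somewhere the attacker can't reach (printed, notarized, on 
another box).

```python
import hashlib
from typing import List, Tuple

GENESIS = "0" * 64  # fixed starting value for the first link


def chain_records(records: List[str]) -> List[Tuple[str, str]]:
    """Pair each record with a SHA-256 digest covering it and the previous digest."""
    chained = []
    prev = GENESIS
    for record in records:
        digest = hashlib.sha256((prev + record).encode("utf-8")).hexdigest()
        chained.append((record, digest))
        prev = digest
    return chained


def verify_chain(chained: List[Tuple[str, str]]) -> bool:
    """Recompute every digest; any edited, removed or reordered record fails."""
    prev = GENESIS
    for record, digest in chained:
        expected = hashlib.sha256((prev + record).encode("utf-8")).hexdigest()
        if digest != expected:
            return False
        prev = digest
    return True
```

Changing any one log line changes its digest and every digest after it, 
which is what makes after-the-fact edits detectable.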

Eventually the business owner just wanted one particularly badly hacked 
box flattened, reloaded by his new sysadmin, and all outside access 
locked down, with monitoring added to watch for "strange" traffic or any 
back-doors in the application code, with a side-project of the new 
admin/developer to read EVERY line of source looking for that kind of 
shenanigans.

Nothing came of any of it after the old sysadmin found another job... 
but we watched him run Nessus scans and make other attempts for a week 
or three.  If he'd carried a grudge, the lawyer was ready to file charges 
and copy his new employer... knowing that would stop the attacks, one 
way or another.

Okay, with all that said, I hope that's not the depth of the situation 
you're in.  Those aren't all that fun because of the unknowns 
involved... watching/waiting for a disgruntled person to attack a box 
is stressful, even if you know you've got the guns trained and loaded to 
fire back at them, since you know there's no good outcome from it for 
them or you... you're still stuck fixing the systems, and you're also 
probably going to mess up their life for a while if things are REALLY bad.

Nate

