[clue-tech] Multiple system backups
William
wlist-clue at kimballstuff.com
Sat Feb 11 19:55:41 MST 2006
David L. Anselmi wrote:
> Here you go, in no particular order. All of this is based on my
> preference and experience so don't take it as criticism but just
> something to think about before you revise the script next time.
It's cool, I like the feedback. We all learn when people with different
ideas come together in dialog. There are a few questions in here, so
I'll address all points to provide my thinking (this is neither
reactionary nor defensive, but exploratory).
> Rather than define a variable to keep the path for every command you
> call, I would set the $PATH at the beginning and just call the commands.
> If you insist on a variable for every command, don't call it FOO_PATH,
> just call it FOO. It's a command, not a path.
I feel like you missed my "Power User" note. :) I specifically did not
set a $PATH to provide a maximal user-control capability. Imagine this
script running chrooted, or a particular user (say, using Debian rather
than Red Hat) prefers alternative commands to the ones I chose to use.
Or, they prefer to supply particular other options to the commands over
and above my choices. I decided that this was a good way to provide a
very highly granular system of control to the (power) user. Ironically,
the _PATH suffix in the variable names is a legacy of my first run at
this control, where I was actually setting up a $PATH variable. It stuck
when I decided to list the individual commands for the reasons I just
specified. :) If the variable name is confusing or misleading, I can
certainly fix it with a quick substitution. :)
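As a rough sketch of the override idea described above (not the actual script): each command variable gets a default only when the user has not already set it. TAR_PATH and SERVICE_PATH are names from this thread; the default values and the show_commands helper are assumptions made for illustration.

```shell
#!/bin/sh
# Defaults apply only when the caller has not already set the variable.
# The default values below are assumptions, not the script's real ones.
: "${TAR_PATH:=tar}"
: "${SERVICE_PATH:=/sbin/service}"

# Hypothetical helper: print the commands that would actually be used.
show_commands() {
    printf 'tar command: %s\n' "$TAR_PATH"
    printf 'service command: %s\n' "$SERVICE_PATH"
}
```

A user could then run something like `TAR_PATH="/usr/local/bin/gtar --some-option" ./backup.sh` (hypothetical names) to swap the command or add options without editing the script. For extra options to work, the variable has to be expanded unquoted where it is called.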
> You frequently say "delimited list". Since you mean "whitespace
> delimited list" I would just say "list".
I worked in the technical support department of a software company for
three years, and today I'm a software architect responsible for
extremely detailed specifications and painstakingly accurate
documentation. I don't short-hand anything when the meaning can be
expressed more precisely. This is to avoid as many forms of confusion
as I can before confusion leads to problems. As I'm sure you know,
"whitespace" can include far more than just "space". I specifically
targeted only one type of whitespace character.
> Debian doesn't have a service command for starting/stopping services, at
> least not on a typical system.
I wasn't aware of this. My only Linux experience is with Red Hat and
derivative products. That presents an interesting, though workable,
problem. Debian users could set the SERVICE_PATH variable to whatever
the equivalent is, if there is one (I hope there is, otherwise I don't
know how Debian users would handle services centrally).
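For what it's worth, Debian systems control services through the init scripts in /etc/init.d (for example, /etc/init.d/samba stop). A minimal sketch of a fallback, assuming the script only needs a "name action" style call: service_command and INIT_DIR are hypothetical names, and the function prints the command line rather than running it so the choice is easy to inspect.

```shell
#!/bin/sh
# SERVICE_PATH comes from the thread; INIT_DIR is a hypothetical addition.
: "${SERVICE_PATH:=/sbin/service}"
: "${INIT_DIR:=/etc/init.d}"

# service_command NAME ACTION
# Print the command line to use: the service wrapper when it exists as an
# executable, otherwise the init script for NAME.
service_command() {
    svc_name=$1
    svc_action=$2
    if [ -x "$SERVICE_PATH" ]; then
        printf '%s %s %s\n' "$SERVICE_PATH" "$svc_name" "$svc_action"
    else
        printf '%s/%s %s\n' "$INIT_DIR" "$svc_name" "$svc_action"
    fi
}
```

On a system without the wrapper, `service_command samba stop` would print the init-script form, which the caller can then execute.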
> Keeping a version and history for each function seems excessive.
> Comments are nice but that stuff belongs in CVS or your changelog.
Most of these functions are portable to other, unrelated code projects.
I did that on purpose; I generally write code that can be very widely
reused (dubbed "generally useful"). On a personal note, I often wish
other developers would document their code as I have here. When I have
to bug-fix someone else's code on-site, I loathe digging through
"disconnected" documentation like change-files or CVS comments. This is
a personal preference.
> There's a lot of string manipulation going on. You should see whether
> perl would be a better choice.
I specifically selected shell script in order to learn shell scripting.
This entire project is an exercise for me and I already have a major
Perl project. :)
> Use install -d rather than writing makedirs(). Probably mkdir -p would
> work too.
My version also applies the chmod, which the others do not (as far as I
can tell). Additionally, the way I handle the component path elements
automatically cleans up otherwise unpredictable paths. For example, if
you pass "//some/dir////broken" to my function, it is automatically
cleaned up as "/some/dir/broken/".
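The cleanup described above can be sketched roughly as follows. This is not the makedirs() function from the script, just one way to get the same result for the example path; normalize_dir_path and make_dir_with_mode are hypothetical names, and this version applies the mode only to the final directory.

```shell
#!/bin/sh
# normalize_dir_path PATH
# Collapse runs of "/" into a single "/" and make sure the result ends in "/".
normalize_dir_path() {
    cleaned=$(printf '%s' "$1" | tr -s '/')
    case $cleaned in
        */) printf '%s\n' "$cleaned" ;;
        *)  printf '%s/\n' "$cleaned" ;;
    esac
}

# make_dir_with_mode PATH MODE
# Create the directory (and any missing parents), then chmod the final one.
make_dir_with_mode() {
    dir=$(normalize_dir_path "$1")
    mkdir -p "$dir" && chmod "$2" "$dir"
}
```

For example, `normalize_dir_path "//some/dir////broken"` prints `/some/dir/broken/`.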
> Looking through main(), you don't actually do much error checking.
> You'll exit if the backup list, tar file, or compressed tar file don't
> exist. But you don't check whether the tar command works, or the
> compression command. For other errors you'll exit 0. What if samba
> doesn't work or a service won't stop or start?
Actually, there is quite a lot of error checking in main(), though of a
different style than you seem to be looking for. I'm measuring output
rather than exit state in main(), although as you probably noted, I do
error-check the command exit states in my other functions in a style
you're probably looking for. This is deliberate, mainly because I want
to reverse the system-level changes I've caused as soon as possible,
regardless of error. If I open the samba connections, I want them
closed right away. If I disable services, I want them right back up
ASAP. I do not test whether services fail to stop because the list of
services can be quite long and I won't abort the whole operation for a
single failure (not to mention, users may put "service" names in the
list that are actually not services -- because I can't tell at run-time
whether the failure is user-driven or a true system failure, I choose to
ignore the failure altogether).
The backup operation is system-critical. If one or two services fail to
respond, it is better to get everything I can off the system as-is than
to get nothing for this reason: why did the service fail? The server
may be about to break... In this case, the user probably has a lot of
other alerts, messages, what have you, that I don't need to compound. I
made the decisions of where I abort-on-failure or ignore-failure or
print-failure-but-continue-anyway deliberately to minimize risk and
down-time during the backup operation. To the best of my ability, the
system is as it was before my script ran when it finishes.
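The policy described above (ignore a failed stop, run the backup, bring everything back up no matter what happened) might look something like this in outline. It is a sketch, not the code from the script: SERVICES, stop_service, start_service and run_backup are hypothetical names, and the stop/start functions are stubs that only print what they would do.

```shell
#!/bin/sh
# Hypothetical list of services to pause during the backup.
SERVICES="httpd crond"

# Stubs standing in for the real stop/start commands.
stop_service()  { echo "stop $1"; }
start_service() { echo "start $1"; }

# run_backup COMMAND [ARGS...]
# Stop every listed service (failures ignored), run the backup command,
# restart every listed service, and only then report the backup's status.
run_backup() {
    for svc in $SERVICES; do
        stop_service "$svc" || true
    done
    backup_status=0
    "$@" || backup_status=$?
    for svc in $SERVICES; do
        start_service "$svc" || true
    done
    return "$backup_status"
}
```

Because the restart loop runs before the function returns, the services come back even when the backup command fails, and the caller still sees the failure in the return status.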
A note on the samba failure question: I am testing the smbmount command
for failure, which -- as you can see from the way I abort with a user
message -- is a critical failure.
> It may be ok to continue but you might want to feed back that something
> wasn't right, so the user can do something about it. For example, I
> have a script that logs to a file and if anything goes wrong cron will
> mail me the file. Even if I wrap your script so cron doesn't mail me
> every time it runs, I can't tell whether I care about the output or not.
You can see where I redirect output vs. where I do not. If something
really does fail -- that is critical to the success of the backup
operation -- then the user will get a message from cron that night. If
the failure can be muted because the net result is not a critical
failure, then I mute it. In other cases, I mute out of necessity. For
example:
$TAR_PATH -cf "$workspace_tar_fqn" -T "$BACKUP_LIST_FILE" 2>/dev/null
This is an "undesired, but necessary mute" because tar outputs something
like "stripping leading slashes in file names..." when used with the -T
option. This message is entirely harmless and should not be reported in
the context I'm using it. Frankly, I'd rather trap whether tar fails
due to insufficient drive space, or something equally critical. Because
I'm redirecting the stderr output, I can't test on the command level --
the behavior is inconsistent in my experience. Put all this together,
and I had to error-check the way you see: by testing the expected output
of the command rather than the command's exit status.
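One possible alternative, offered only as a sketch: a redirect by itself does not change a command's exit status, so stderr can be captured to a temporary file, the known-harmless line filtered out, and the exit status still checked. run_filtered is a hypothetical helper, and the pattern passed to it is an assumption about the wording of tar's message on a given system.

```shell
#!/bin/sh
# run_filtered PATTERN COMMAND [ARGS...]
# Run COMMAND, drop stderr lines matching PATTERN, pass every other stderr
# line through, and return COMMAND's own exit status.
run_filtered() {
    pattern=$1
    shift
    errfile=$(mktemp) || return 1
    cmd_status=0
    "$@" 2>"$errfile" || cmd_status=$?
    # grep -v exits non-zero when nothing is left to print; that is fine.
    grep -v -e "$pattern" "$errfile" >&2 || true
    rm -f "$errfile"
    return "$cmd_status"
}
```

With the variables from the line quoted above, a call might look like `run_filtered 'leading' $TAR_PATH -cf "$workspace_tar_fqn" -T "$BACKUP_LIST_FILE" || echo "tar failed" >&2`, so anything other than the harmless notice still reaches cron.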
> Your comments in main() could use improving. "Perform the backup" isn't
> nearly as helpful as stating what to do if there are problems. In the
> places you've decided to continue, why is that the right thing to do?
Answered above. As for the comments, I don't understand the complaint.
I'm documenting almost at the per-line level. "Perform the backup"
immediately precedes the tar command (making it an obvious remark),
which is followed by (after the services are restored) the
error-checking code for that tar operation -- in the else condition, you
see the comment, "The backup tarball failed." There is no more
information that I can express to a maintenance programmer without being
overly redundant. :)
> Dave
Thanks for the feedback! I truly appreciate the good along with any bad.
I've been programming for over 20 years, but this is my very first
"major" shell script. I realize that I have room to grow here. :)
--
William Kimball, Jr.
http://www.kimballstuff.com/
"Programming is an art-form that fights back!" (Unknown)
_______________________________________________
CLUE-tech mailing list
CLUE-tech at cluedenver.org
http://cluedenver.org/mailman/listinfo/clue-tech