[clue-tech] Multiple system backups

William wlist-clue at kimballstuff.com
Sat Feb 11 19:55:41 MST 2006


David L. Anselmi wrote:
> Here you go, in no particular order.  All of this is based on my 
> preference and experience so don't take it as criticism but just 
> something to think about before you revise the script next time.

It's cool, I like the feedback.  We all learn when people with different 
ideas come together in dialog.  There are a few questions in here, so 
I'll address all points to provide my thinking (this is neither 
reactionary nor defensive, but exploratory).

> Rather than define a variable to keep the path for every command you 
> call, I would set the $PATH at the beginning and just call the commands. 
>  If you insist on a variable for every command, don't call it FOO_PATH, 
> just call it FOO.  It's a command, not a path.

I feel like you missed my "Power User" note.  :)  I specifically did not 
set a $PATH to provide a maximal user-control capability.  Imagine this 
script running chrooted, or a particular user (say, using Debian rather 
than Red Hat) prefers alternative commands to the ones I chose to use. 
Or, they prefer to supply particular other options to the commands over 
and above my choices.  I decided that this was a good way to provide a 
very highly granular system of control to the (power) user.  Ironically, 
the _PATH variable suffix is a legacy of my first run at this 
control, where I was actually setting up a $PATH variable.  It stuck 
when I decided to list the individual commands for the reasons I just 
specified.  :)  If the variable name is confusing or misleading, I can 
certainly fix it with a quick substitution.  :)

> You frequently say "delimited list".  Since you mean "whitespace 
> delimited list" I would just say "list".

I worked in the technical support department of a software company for 
three years, and today I'm a software architect responsible for 
extremely detailed specifications and painstakingly accurate 
documentation.  I don't short-hand anything when the meaning can be 
expressed more precisely.  This is to avoid as many forms of confusion 
as I can before confusion leads to problems.  As I'm sure you know, 
"whitespace" can include far more than just "space".  I specifically 
targeted only one type of whitespace character.
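
To show what "space-delimited" (as opposed to "whitespace-delimited") can mean in practice, here is a minimal sketch.  The function name split_on_spaces is hypothetical and the real script may split its lists differently; the sketch only assumes a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper: prints each item of a space-delimited list on its
# own line.  Only the space character separates items, so a tab inside
# an item is kept as part of that item.  The body runs in a subshell
# (note the parentheses) so the IFS and globbing changes do not leak out.
split_on_spaces() (
    IFS=' '
    set -f                      # do not expand wildcards while splitting
    for item in $1; do
        printf '%s\n' "$item"
    done
)
```

With the default IFS, a tab or newline inside the list would also start a new item; restricting IFS to a single space is what limits the split to that one character.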

> Debian doesn't have a service command for starting/stopping services, at 
> least not on a typical system.

I wasn't aware of this.  My only Linux experience is with Red Hat and 
derivative products.  That presents an interesting, though workable, 
problem.  Debian users could set the SERVICE_PATH variable to whatever 
the equivalent is, if there is one (I hope there is, otherwise I don't 
know how Debian users would handle services centrally).
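
One possible shape for that workaround is sketched below.  SERVICE_PATH is the variable mentioned above; INIT_DIR and control_service are hypothetical names, and the fallback assumes that the system keeps its init scripts in /etc/init.d, as Debian systems conventionally do.

```shell
#!/bin/sh
# SERVICE_PATH comes from the discussion above; an empty value is taken
# here (as an assumption) to mean "call the init script directly".
: "${SERVICE_PATH:=}"
: "${INIT_DIR:=/etc/init.d}"    # hypothetical override, mainly for testing

# control_service NAME ACTION   (hypothetical wrapper)
control_service() {
    name=$1
    action=$2
    if [ -n "$SERVICE_PATH" ]; then
        # Red Hat style: a central command, e.g. /sbin/service
        $SERVICE_PATH "$name" "$action"
    elif [ -x "$INIT_DIR/$name" ]; then
        # Fallback: run the init script itself
        "$INIT_DIR/$name" "$action"
    else
        echo "cannot $action '$name': no service command or init script" >&2
        return 1
    fi
}
```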

> Keeping a version and history for each function seems excessive. 
> Comments are nice but that stuff belongs in CVS or your changelog.

Most of these functions are portable to other, unrelated code projects. 
  I did that on purpose; I generally write code that can be very widely 
reused (dubbed "generally useful").  On a personal note, I often wish 
other developers would document their code as I have here.  When I have 
to bug-fix someone else's code on-site, I loathe digging through 
"disconnected" documentation like change-files or CVS comments.  This is 
a personal preference.

> There's a lot of string manipulation going on.  You should see whether 
> perl would be a better choice.

I specifically selected shell script in order to learn shell scripting. 
  This entire project is an exercise for me and I already have a major 
Perl project.  :)

> Use install -d rather than writing makedirs().  Probably mkdir -p would 
> work too.

My version also applies the chmod, which the others do not (as far as I 
can tell).  Additionally, the way I handle the component path elements 
automatically cleans up otherwise unpredictable paths.  For example, if 
you pass "//some/dir////broken" to my function, it is automatically 
cleaned up as "/some/dir/broken/".
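
The actual makedirs() is not reproduced in this message, so the following is only a minimal sketch of the two behaviors described: cleaning up the path and applying a mode.  The names normalize_dir and make_dir_with_mode are hypothetical, and unlike a component-by-component approach this sketch applies the mode to the final directory only.

```shell
#!/bin/sh
# Hypothetical helper: squeeze repeated slashes into one and make sure
# the result ends with exactly one trailing slash.
normalize_dir() {
    path=$(printf '%s' "$1" | tr -s '/')
    case $path in
        */) printf '%s\n' "$path" ;;
        *)  printf '%s/\n' "$path" ;;
    esac
}

# Hypothetical helper: make_dir_with_mode PATH MODE
# Creates the directory (and any missing parents), then sets the mode on
# the final directory only.
make_dir_with_mode() {
    dir=$(normalize_dir "$1")
    mode=$2
    mkdir -p "$dir" && chmod "$mode" "$dir"
}
```

Here `normalize_dir "//some/dir////broken"` prints "/some/dir/broken/".  Both mkdir and install also accept a -m option for setting a mode, which may be worth comparing against a hand-written function.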

> Looking through main(), you don't actually do much error checking. 
> You'll exit if the backup list, tar file, or compressed tar file don't 
> exist.  But you don't check whether the tar command works, or the 
> compression command.  For other errors you'll exit 0.  What if samba 
> doesn't work or a service won't stop or start?

Actually, there is quite a lot of error checking in main(), though of a 
different style than you seem to be looking for.  I'm measuring output 
rather than exit state in main(), although as you probably noted, I do 
error-check the command exit states in my other functions in a style 
you're probably looking for.  This is deliberate, mainly because I want 
to reverse the system-level changes I've caused as soon as possible, 
regardless of error.  If I open the samba connections, I want them 
closed right away.  If I disable services, I want them right back up 
ASAP.  I do not test whether services fail to stop because the list of 
services can be quite long and I won't abort the whole operation for a 
single failure (not to mention, users may put "service" names in the 
list that are actually not services -- because I can't tell at run-time 
whether the failure is user-driven or a true system failure, I choose to 
ignore the failure altogether).

The backup operation is system-critical.  If one or two services fail to 
respond, it is better to get everything I can off the system as-is than 
to get nothing for this reason:  why did the service fail?  The server 
may be about to break...  In this case, the user probably has a lot of 
other alerts, messages, what have you, that I don't need to compound.  I 
made the decisions of where I abort-on-failure or ignore-failure or 
print-failure-but-continue-anyway deliberately to minimize risk and 
down-time during the backup operation.  To the best of my ability, the 
system is as it was before my script ran when it finishes.
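
The ordering described above (stop what can be stopped, do the work, restore right away, and only then look at the result) can be sketched roughly as follows.  SERVICE_PATH is the variable from the script; its default here, SERVICE_LIST, set_services and with_services_stopped are all hypothetical names used for illustration.

```shell
#!/bin/sh
: "${SERVICE_PATH:=service}"    # the default value is an assumption
: "${SERVICE_LIST:=}"           # hypothetical: list of service names

# set_services ACTION   (hypothetical helper)
set_services() {
    action=$1
    for svc in $SERVICE_LIST; do
        # Failures are ignored on purpose: a name may not be a real
        # service, and one stubborn service should not stop the backup.
        $SERVICE_PATH "$svc" "$action" >/dev/null 2>&1 || true
    done
}

# Runs the given command with the services stopped, restarts them
# immediately afterwards, and only then reports the command's status.
with_services_stopped() {
    set_services stop
    status=0
    "$@" || status=$?
    set_services start
    return "$status"
}
```

A caller could then write something like `with_services_stopped some_backup_command` and decide afterwards whether a non-zero status is worth reporting, knowing the services have already been restarted.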

A note on the samba failure question:  I am testing the smbmount command 
for failure, which -- as you can see from the way I abort with a user 
message -- is a critical failure.

> It may be ok to continue but you might want to feed back that something 
> wasn't right, so the user can do something about it.  For example, I 
> have a script that logs to a file and if anything goes wrong cron will 
> mail me the file.  Even if I wrap your script so cron doesn't mail me 
> every time it runs, I can't tell whether I care about the output or not.

You can see where I redirect output vs. where I do not.  If something 
really does fail -- that is critical to the success of the backup 
operation -- then the user will get a message from cron that night.  If 
the failure can be muted because the net result is not a critical 
failure, then I mute it.  In other cases, I mute out of necessity.  For 
example:

$TAR_PATH -cf "$workspace_tar_fqn" -T "$BACKUP_LIST_FILE" 2>/dev/null

This is an "undesired, but necessary mute" because tar outputs something 
like "stripping leading slashes in file names..." when used with the -T 
option.  This message is entirely harmless and should not be reported in 
the context I'm using it.  Frankly, I'd rather trap whether tar fails 
due to insufficient drive space, or something equally critical.  Because 
I'm redirecting the stderr output, I can't test on the command level -- 
the behavior is inconsistent in my experience.  Put all this together, 
and I had to error-check the way you see; by testing the expected output 
of the command rather than the command's exit status.
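
For comparison, here is a minimal sketch of an alternative that keeps the exit status available while still hiding the harmless notice.  It relies on the fact that redirecting stderr does not by itself change a command's exit status.  The function name create_tar is hypothetical, the default for TAR_PATH is an assumption, mktemp is assumed to be available, and the filter pattern assumes GNU tar's wording of the notice ("Removing leading ...").

```shell
#!/bin/sh
: "${TAR_PATH:=tar}"    # default is an assumption; the script sets its own

# create_tar ARCHIVE LIST_FILE   (hypothetical helper)
create_tar() {
    archive=$1
    list_file=$2
    err_file=$(mktemp) || return 1

    # Capture stderr in a temporary file and remember tar's exit status.
    status=0
    $TAR_PATH -cf "$archive" -T "$list_file" 2>"$err_file" || status=$?

    # Pass along every message except the notice about leading slashes.
    grep -v 'Removing leading' "$err_file" >&2 || true
    rm -f "$err_file"
    return "$status"
}
```

Any other message from tar (a missing file, a full disk) would still reach stderr, and so reach cron's mail, while the function's return value reflects whether tar itself reported a failure.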

> Your comments in main() could use improving.  "Perform the backup" isn't 
> nearly as helpful as stating what to do if there are problems.  In the 
> places you've decided to continue, why is that the right thing to do?

Answered above.  As for the comments, I don't understand the complaint. 
   I'm documenting almost at the per-line level.  "Perform the backup" 
immediately precedes the tar command (making it an obvious remark), 
which is followed by (after the services are restored) the 
error-checking code for that tar operation -- in the else condition, you 
see the comment, "The backup tarball failed."  There is no more 
information that I can express to a maintenance programmer without being 
overly redundant.  :)

> Dave

Thanks for the feedback!  I truly appreciate any good with any bad. 
I've been programming for over 20 years, but this is my very first 
"major" shell script.  I realize that I have room to grow here.  :)

-- 
William Kimball, Jr.
http://www.kimballstuff.com/
"Programming is an art-form that fights back!" (Unknown)
_______________________________________________
CLUE-tech mailing list
CLUE-tech at cluedenver.org
http://cluedenver.org/mailman/listinfo/clue-tech


