Perl Advent Calendar 2006-12-13

The Nutcracker

by Adriano Ferreira

When I started programming, I didn't do it in what I consider today a great style — the Unix way. IDEs for Pascal and C running under DOS were my gentle first tools. But I survived. Much later I met Perl and learned a lot about Unix, its utilities and libraries.

Among Unix tools, shell scripting always warrants mention as powerful yet simple and fast ability to glue to put together utilities for a specific task. But shell scripts need not be simple, in which case they are not fast (at least) for me to write. Larry said once "It is easier to port a shell than a shell script." Worse still, they don't even look like real languages. Expressions like:

if [ -e nuts.html ] ; then echo ok ; else echo bah ; fi

give me the willies. The inability to tweak formatting is annoying, if I wanted syntactically significant whitespace I'd use Python.

Enter Shell, which allows you to transparently invoke (via a shell) any program as a subroutine call. When I realized its usefulness, every shell script I needed to write with more than a sequence of commands were automatically promoted to a Perl script use-ing Shell.

The module has a deceptively simple interface. Doing:

use Shell qw(tar);

installs a stub in the calling package to spawn tar with any arguments you pass it.

Another way to use Shell, which is not recommended, results from its initial design by Larry as a showcase for Perl 5.0.0's interprocess communication facilities and the new AUTOLOAD feature. Through AUTOLOAD, every call to an unknown sub name will cause perl to try to find and execute a program of the same name. But as soon as the box was opened, Larry Wall realized the dangers of this implementation.

use Shell; # Attention! AUTOLOADing zone

sub foo { 
  # do something fooy
}
fool; # typo will make it pass &foo to the shell

Of course, the inability to detect typos is a "feature" inherent in any use of AUTOLOAD, but what if you meant to call rn and instead called rm?

Finally, if you're lOOpy, you may prefer the following style of invocation that also avoids polluting the namespace of the calling package.

use Shell (); #Note the use of a null import list

my $sh = Shell->new;
print $sh->ls("-lh", $dir);

Despite being a toy for demonstration, Shell is acceptable to use it in some applications: building quick scripts which may be improved later, by replacing the call to external programs with (faster) Perl equivalents, or which may remain as they are if they are good enough for their purposes.

CAVEATS

  1. Scripts written with Shell may not be portable due to the availability of the invoked utilities and some shell idiosyncracies. However, as they are written in Perl, there are no issues about language incompatibilities as it happens with slightly differing shells e.g; csh vs. tcsh.1
  2. The module could be made more useful with further fine tuning. For instance, it would be nice to have better handling of return codes.
  3. Beware of clashes between the names of core Perl functions and external programs. While trying to import grep is futile, you could use fgrep or egrep if your system has them.
  4. Because Shell knows nothing about the programs you are executing it must allocate a buffer for each command invoked, just in case you need access to the output. This is akin to the minor sin of using qx{} even when you mean system().
  5. Pipes, redirections and background processes can be difficult or impossible to express and manage with this module. Fortunately Shell::Autobox will help you tackle the former.

OUR EXAMPLE

The example attached to this article shows how to do a hot backup of a Subversion repository, archive and compress it into a .tar.bz2 file and send it via e-mail with mutt. There is also some logic to avoid doing the backup more than once a day. The shell script version is left as an exercise to the reader.

This script is as portable as the availability of the external programs used, and the consistency of their implementation: svnadmin, a tar which supports the -j option to pipe the result through bzip2, mutt, du (with -h for human readable sizes) and touch. Portability is also affected by any cleverly evil things you do with the shell syntax of the arguments. Here the only "trick" used is input redirection when calling mutt (which is a very common shell feature). It should run happily on Win32, Cygwin, Linux, FreeBSD and others if you have the necessary requirements at your system.

Minor changes to the same script can be done to adjust it to other tasks. Each use case has its particular characteristics, like using svnadmin hotcopy for doing a hot backup over a live Subversion repository (which I learned here). But the rationale is the same: with tiny modifications of the example, you could easily do a Firebird database backup, invoking the gbak utility, a simple backup of some directory tree or any number of other things.

mod13.pl


   1 #!/usr/bin/perl
   2 
   3 use strict;
   4 use warnings;
   5 
   6 use Shell qw(svnadmin tar du mutt touch);
   7 use File::Slurp qw(read_file write_file);
   8 use File::Path qw(rmtree);
   9 
  10 # the repository location
  11 my $ROOT = '/usr/share/svnroot/'; 
  12 # where I send the backup to
  13 my $EMAIL = 'backup-sink@example.com';
  14 # control file: the last backup time
  15 my $CTL = glob '~/bk-time'; 
  16 
  17 my $LAST_MARK = -e $CTL ? read_file($CTL) : 0;
  18 my $PERIOD = 24*60*60; # 24 hours 
  19 my $NOW = time;
  20 
  21 if ($NOW>$LAST_MARK+$PERIOD) { # more than $PERIOD later
  22   my $TMP = 'svnbak';
  23   # backing up the repository
  24   svnadmin("hotcopy", $ROOT, $TMP);
  25   # archiving/compressing
  26   my $BAK = "$TMP.tar.bz2";
  27   tar("cfj", $BAK, $TMP);
  28 
  29   my $SUBJECT = "[ATTIC] backup of $ROOT";
  30   my $BODY = sprintf <<BODY, scalar localtime, -s $BAK, du("-hs", $ROOT);
  31 $ROOT: backup at %s
  32   %d Bytes - $BAK
  33   %s       (uncompressed)
  34 BODY
  35   write_file('body.txt', $BODY);
  36   mutt("-s", $SUBJECT, "-a", $BAK, $EMAIL, '<', 'body.txt'); # sending mail
  37 
  38   rmtree($TMP);
  39   unlink $BAK or warn "could not rm '$BAK': $!\n"; 
  40   unlink 'body.txt' or warn "could not rm 'body.txt': $!\n"; 
  41 
  42   my $MARK = int($NOW / $PERIOD) * $PERIOD;
  43   write_file($CTL, $MARK);
  44 
  45 } else {
  46   # don't do anything but
  47   touch($CTL); # update timestamp
  48 }
1. Yeah yeah…