The 2002 Perl Advent Calendar

On the 2nd day of Advent my True Language brought to me..
IO::AtomicFile

IO::AtomicFile is one of those modules that has saved my skin more times than I care to mention. It's a simple module that deals with the situation where you're overwriting an existing file, and you want to preserve the existing file right up until the moment you're done writing the new file and you're sure it's a valid replacement.

To do this the module automates the process of writing to a temp file and then renaming that file to the destination. This means that any program trying to read the existing file will get the old one until the new file is completely done.
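
The write-then-rename idea can be sketched by hand with nothing but core Perl. This is only an illustration of the technique, not IO::AtomicFile's own code: the write_via_temp helper, the .tmp suffix and the file names are all made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Hypothetical helper: write $content to a temp file next to $path,
# then rename the temp file over $path.
sub write_via_temp {
    my ($path, $content) = @_;
    my $temp = "$path.tmp";    # assumed temp name, same directory as $path

    open my $fh, ">", $temp
        or die "Can't open '$temp': $!";
    print {$fh} $content
        or die "Can't write to '$temp': $!";
    close $fh
        or die "Can't close '$temp': $!";

    # readers keep seeing the old file until this rename succeeds
    rename $temp, $path
        or die "Can't rename '$temp' to '$path': $!";
    return 1;
}

# work in a scratch directory so the example cleans up after itself
my $dir    = tempdir(CLEANUP => 1);
my $target = catfile($dir, "example.txt");

write_via_temp($target, "first version\n");
write_via_temp($target, "second version\n");
```

Keeping the temp file in the same directory as the destination matters: a rename within one filesystem is a single step, whereas moving a file between filesystems generally is not.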

The other big advantage in this technique is that at any point you can abandon the current file you're writing and the old one is still intact. Because of this IO::AtomicFile allows you to write defensive code that acts well when it encounters errors.

IO::AtomicFile is really easy to use - you just use it exactly like you'd use IO::File. But to explain that, I'd better explain how IO::File works, just in case you haven't used it before.

A Recap of File Handling in Perl

This is a quick recap of file handling in Perl. Feel free to skip to the next section if this is all familiar to you.

An example program that creates a web page full of random numbers:

  # open the filehandle to index.html
  open FH, ">", "index.html"
    or die "Can't open 'index.html': $!";
  # print the web page
  print FH webpage();
  # close the filehandle.  Will happen automatically at the
  # end of the script if we don't do it ourselves
  close FH;
  # create a page of random numbers
  sub webpage
  {
    my $output = q{
      <html>
      <head><title>Today's Random Numbers</title></head>
      <body>};
    # create the random numbers 
    foreach my $count (1..100)
    {
     $output .= "Random number $count: " .
                int(rand(10000)) . "<br />\n";
    }
 
    $output .= "</body></html>";
    return $output;
  }

Since Perl 5.6.0 we've been able to rewrite the file operations above so that the filehandle lives in an ordinary scalar variable.

  # open the filehandle to index.html
  open my $fh, ">", "index.html"
    or die "Can't open 'index.html': $!";
  # print the webpage
  print {$fh} webpage();
  # close the file.  If we don't do this then it will be closed
  # automatically when $fh goes out of scope. 
  close $fh;

This has the advantage that $fh is now just a normal scalar: you can pass it around just like any other variable, and there's a lot less chance of introducing some weird scoping bug. There's another possibility though - one that works on even older perls than 5.6.0: the IO::File module.
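
As a small sketch of what "passing it around" looks like in practice, here's a filehandle in a scalar being handed to a subroutine. The print_footer helper and the page contents are invented for the illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Hypothetical helper: prints a footer to whatever handle it is given
sub print_footer {
    my ($out) = @_;
    print {$out} "<p>Generated by a script</p>\n";
}

my $dir  = tempdir(CLEANUP => 1);
my $page = catfile($dir, "index.html");

{
    open my $fh, ">", $page
        or die "Can't open '$page': $!";
    print {$fh} "<p>Hello</p>\n";
    print_footer($fh);    # the handle is passed like any other scalar
    # no explicit close: the file is closed when $fh goes out of scope
}
```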

  # load the module, then open the file handle to index.html
  use IO::File;
  my $fh = IO::File->new("index.html", ">")
    or die "Can't open 'index.html': $!";
  # print the web page to the file handle
  print {$fh} webpage();
  # close the file.  If we don't do this then it will be closed
  # automatically when $fh goes out of scope. 
  close $fh;

Note that the order of the arguments is different to the open command.
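
Here's a short sketch of the IO::File calling convention: filename first, then the mode. The mode can be written Perl-style (">", "<") or fopen-style ("w", "a", "r"). The file name and contents are made up for the example.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

my $dir  = tempdir(CLEANUP => 1);
my $path = catfile($dir, "numbers.txt");

# filename first, then mode (open takes the mode before the filename)
my $out = IO::File->new($path, ">")
    or die "Can't open '$path': $!";
$out->print("one\n");    # method-call form of: print {$out} "one\n"
$out->close;

# "a" is the fopen-style spelling of the ">>" append mode
my $append = IO::File->new($path, "a")
    or die "Can't open '$path': $!";
$append->print("two\n");
$append->close;

# "r" is the fopen-style spelling of "<"
my $in = IO::File->new($path, "r")
    or die "Can't open '$path': $!";
my @lines = $in->getlines;
$in->close;
```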

Using IO::AtomicFile

Now the question is: what happens if someone tries to access the web page while you're updating it? If you're writing slowly enough, they could quite possibly read the file while you're still creating it and get only half a file. This is where IO::AtomicFile comes in. Simply by replacing IO::File with IO::AtomicFile we get atomic file creation.

  # load the module, then open the file handle to index.html
  use IO::AtomicFile;
  my $fh = IO::AtomicFile->new("index.html", ">")
    or die "Can't open 'index.html': $!";
  # print the web page to the file handle
  print {$fh} webpage();
  # close the file.  If we don't do this then it will be closed
  # automatically when $fh goes out of scope. 
  close $fh;

What actually happens is that a file index.html.TMP is created and the output is written there. This file is then renamed to index.html - and this is an 'atomic operation' meaning that it happens, to all intents and purposes, instantaneously. One instant the old file is there, the next instant the new one is in place.

This rename happens when you close the file - both with the explicit close listed above and in the situation where you let $fh go out of scope and it's closed automatically. In other words, you don't normally need to worry about it - it happens transparently.
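
To see the timing for yourself, something along these lines should do. It's a sketch rather than anything official: the slurp helper and the file contents are invented, and it assumes IO::AtomicFile is installed from CPAN.

```perl
use strict;
use warnings;
use IO::AtomicFile;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Hypothetical helper: read a whole file into a string
sub slurp {
    my ($path) = @_;
    open my $in, "<", $path
        or die "Can't read '$path': $!";
    local $/;
    return scalar <$in>;
}

my $dir  = tempdir(CLEANUP => 1);
my $page = catfile($dir, "index.html");

# put an "old" version of the page in place first
open my $old, ">", $page
    or die "Can't open '$page': $!";
print {$old} "old page\n";
close $old;

# start writing the replacement
my $fh = IO::AtomicFile->new($page, ">")
    or die "Can't open '$page': $!";
print {$fh} "new page\n";

my $before = slurp($page);    # not closed yet, so still the old version
$fh->close;                   # the rename happens here
my $after  = slurp($page);    # now the new version

print "before close: $before";
print "after close:  $after";
```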

Disaster recovery

The best thing about writing to a temporary file rather than directly to the 'live' file is that if at any time anything goes wrong, we can simply back out and give up. This is the actual code that I use to create the Advent Calendar pages.

    # open an atomic file.  This creates a temp file, which means
    # that we can abandon any changes we've made, and that the live
    # version is replaced all at once when we're done.
    my $output_fh = IO::AtomicFile->new(catfile($dir,$file),">")
      or die "Can't open file for writing: $!";
    # try writing the template, and unless it's okay...
    unless ($template->process(catfile(TEMPLATE_DIR,$file),{},
                               $output_fh))
    {
      # eeek! a problem, okay, don't write that to the real file
      # whatever we do, delete the temp file
      $output_fh->delete;
      # die
      die "Problem with template: " . $template->error();
    }

So, as you can see, the unless checks whether the $template->process call had any errors (it'll return undef if it did). If it did, we abandon the file we've been working on by calling the delete method on the filehandle, and the live file is left exactly as it was.
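
The same pattern can be tried without a template engine. In this sketch the render subroutine is a made-up stand-in that writes half a page and then reports failure; everything else assumes IO::AtomicFile is installed from CPAN.

```perl
use strict;
use warnings;
use IO::AtomicFile;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Hypothetical stand-in for a template engine: writes some output,
# then returns whatever success flag it was told to return
sub render {
    my ($out, $ok) = @_;
    print {$out} "half a page...";
    return $ok;
}

my $dir  = tempdir(CLEANUP => 1);
my $page = catfile($dir, "index.html");

# the existing "live" page
open my $old, ">", $page
    or die "Can't open '$page': $!";
print {$old} "old page\n";
close $old;

my $fh = IO::AtomicFile->new($page, ">")
    or die "Can't open '$page': $!";
if (render($fh, 0)) {
    $fh->close;     # success: swap the new file into place
}
else {
    $fh->delete;    # failure: throw the temp file away
    print "Problem rendering the page, keeping the old one\n";
}

# check what the live file contains now
open my $in, "<", $page
    or die "Can't read '$page': $!";
my $content = do { local $/; <$in> };
close $in;
print "live file still says: $content";
```

A real script would more likely die after the delete, as the Advent Calendar code does; the example prints a message instead so that it can carry on and show the live file.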

  • perlopentut
  • IO::File