Perl 2002 Advent Calendar: Object::Realize::Later

Have you ever written an object that has to do something really quite intensive whenever it's created - process some data, load something across a network - and thought to yourself "Do I really need to do this? Will this actually be used? Or will the program just check some simple attribute and discard the object, and I'll have wasted all that processing power?"

Often you end up creating objects that will never be fully used. For example, imagine you have a routine that returns an array of objects that represent all the image files in a directory. Now, actually loading the images from these files is quite a slow task, relatively speaking, and we'd rather not do it if we don't have to. If we load each image as we create the object it'll take a few seconds on a big directory just to return the array. The trouble is that we have no idea which images the caller is just going to check the filename of, and which ones they're going to need the actual image data for.

When it comes down to it, it's often a lot easier to code your objects to initialise themselves fully when they're created than to have each and every method check whether some data has been loaded already and load it if it hasn't. And as "easier" often translates as "faster to code", that's often what people do.
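To give a feel for the bookkeeping involved, here's a rough sketch of the check-in-every-method approach. LazyImage, _ensure_loaded and the made-up width and height are all invented for illustration, and the "slow" load is faked with a plain hash rather than a real image.

```perl
package LazyImage;   # hypothetical class, for illustration only

use strict;
use warnings;

my $loads = 0;       # counts how often the "slow" load runs

sub new
{
  my ($class, $filename) = @_;
  return bless { filename => $filename }, $class;
}

# stand-in for the slow step (a real class would read the image here)
sub _load
{
  my $self = shift;
  $loads++;
  return { width => 640, height => 480 };   # made-up data
}

# every method that needs the data has to remember to call this
sub _ensure_loaded
{
  my $self = shift;
  $self->{image} ||= $self->_load;
  return $self->{image};
}

# needs no image data, so no check
sub filename { return $_[0]->{filename} }

sub size
{
  my $self = shift;
  my $image = $self->_ensure_loaded;   # the check we must not forget
  return ($image->{width}, $image->{height});
}

# how many times the slow step has run so far
sub load_count { return $loads }

package main;

my $lazy = LazyImage->new("camel.jpg");
print $lazy->filename, "\n";                 # no load needed
print join("x", $lazy->size), "\n";          # triggers the load
print join("x", $lazy->size), "\n";          # reuses the loaded data
print "loads: ", LazyImage->load_count, "\n";
```

Every new method that touches the image data has to remember that call to _ensure_loaded, which is exactly the kind of chore that gets skipped when you're in a hurry.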

What we really need is some kind of mechanism that allows us to automatically do tasks the first time someone tries to do something complicated with the object. A system that allows us to transform our simple object into a fully realised one that has all its data in place. What we need is Object::Realize::Later.

[Read the documentation for Object::Realize::Later on search.cpan.org]

So, for once my foreword contains a useful example that I hope to illustrate further in this section. Let's write a really simple example of an object that represents an image file.

  package MyImage;

  # turn on perl's safety features
  use strict;
  use warnings;

  # load the GD image handling library
  use GD;

  # constructor, takes a file name as an argument
  # and loads the image
  sub new
  {
    my $class = shift;     # the class name
    my $filename = shift;  # the image filename

    # create the object
    my $self = bless {}, $class;

    # load the image
    $self->{image} = GD::Image->new($filename)
      or die "Cannot load image";

    # store the filename
    $self->{filename} = $filename;

    # return the object
    return $self;
  }

  # return the image filename
  sub filename
  {
    my $self = shift;
    return $self->{filename};
  }

  # return the size of the image
  sub size 
  {
    my $self = shift;
    return ($self->{image}->getBounds());
  }

  # the image as a png
  sub png
  {
    my $self = shift;
    return $self->{image}->png;
  }

  # the image as a jpeg
  sub jpeg
  {
    my $self = shift;
    return $self->{image}->jpeg;
  }

1;

The way this is coded, it loads the image data from disk as soon as the object is created. This is wasteful if all we ever do with the object is retrieve its filename. The solution is to create a 'proxy' class that will only store and return the filename. Any other method call on it will cause it to automatically upgrade itself to a real MyImage object.

  package MyImage::Proxy;

  use strict;
  use warnings;

  # load GD here too, as the load method below calls GD::Image->new
  use GD;

  # when any method that isn't defined for this object is called
  # 'realise' it by calling 'load' to turn it into a 'MyImage'
  use Object::Realize::Later
            becomes => 'MyImage',   # class it'll become
            realize => 'load';      # method that'll do it

  # the constructor.  It's exactly the same as before,
  # just without the image being loaded.
  sub new
  {
    my $class = shift;
    my $self = bless {}, $class;

    # store the filename
    $self->{filename} = shift;

    return $self;
  }

  # define the filename method.  As this method is defined,
  # calling it won't cause the object to be realised.
  sub filename
  {
    my $self = shift;
    return $self->{filename};
  }

  # The method that defines how we should turn this 
  # "MyImage::Proxy" into a "MyImage"
  sub load
  {
    my $self = shift;
    
    # change the class by reblessing it into the new class
    bless $self, "MyImage";

    # load the data that's missing
    $self->{image} = GD::Image->new($self->{filename})
        or die "Cannot load image";

    # return it
    return $self;
  }

1;

So, let's run through a typical example of using this code. Let's create a new image object and check what class it is.

  # create an image
  my $image = MyImage::Proxy->new("camel.jpg");
  print $image->filename, " = ", ref($image), "\n";
  
This, as we expect, prints out

  camel.jpg = MyImage::Proxy

Now let's try saving the image as a PNG file.

  use IO::File;
  my $fh = IO::File->new("camel.png",">")
    or die "Can't open camel.png: $!";
  binmode $fh;
  print {$fh} $image->png;

Which works, even though MyImage::Proxy doesn't have a png method. As soon as an unknown method was called on $image, the load method was called. This converts the object into an instance of MyImage by reblessing it into that class and loading the missing data. Now when png is automatically called again on the object by Object::Realize::Later it will be successful, because the object is now a MyImage and so does have a png method. We can confirm all of this by printing out the class again:

  print $image->filename, " = ", ref($image), "\n";

Which this time prints out the new class:

  camel.jpg = MyImage
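
If you're curious how a call to a missing method can end up realising the object, the behaviour described above can be imitated with perl's AUTOLOAD mechanism. This is only a sketch of the general idea, not a copy of how Object::Realize::Later is actually written, and the Sketch::Proxy and Sketch::Real classes are invented for the example (with a string standing in for the image data):

```perl
use strict;
use warnings;

package Sketch::Real;      # hypothetical "full" class

sub filename { return $_[0]->{filename} }
sub data     { return $_[0]->{data} }

package Sketch::Proxy;     # hypothetical lightweight class

our $AUTOLOAD;

sub new
{
  my ($class, $filename) = @_;
  return bless { filename => $filename }, $class;
}

# defined here, so calling it never triggers AUTOLOAD
sub filename { return $_[0]->{filename} }

# upgrade the object in place by reblessing it
sub load
{
  my $self = shift;
  bless $self, "Sketch::Real";
  $self->{data} = "contents of $self->{filename}";  # stand-in for slow work
  return $self;
}

# perl calls AUTOLOAD for any method this class doesn't define
sub AUTOLOAD
{
  my $self = shift;
  (my $method = $AUTOLOAD) =~ s/.*:://;   # strip the package name
  return if $method eq 'DESTROY';         # never realise during destruction
  die "No class method '$method'" unless ref $self;

  $self->load;                            # now a Sketch::Real
  return $self->$method(@_);              # call the same method again
}

package main;

my $object = Sketch::Proxy->new("camel.jpg");
print ref($object), "\n";      # still a Sketch::Proxy
print $object->data, "\n";     # no such method on the proxy, so load runs
print ref($object), "\n";      # now a Sketch::Real
```

The real module does rather more than this, so treat the sketch as a way of picturing what's going on rather than as a replacement.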