Perl Advent Calendar 2006-12-03

Trimming the Treemap (with Data)

by Bill Ricker

Treemaps have been around for over a decade. Ben Shneiderman, one of the great user-interface gurus, developed the first one to manage the disk space on his UMD server, and documents their evolution. They've since been adapted to network bandwidths, internet community mapping, stock markets (a Treemap combined with a Heatmap to show gain/loss), and contest scores, among other applications. Many programmers know there are commercial and free Java implementations, but may be unaware that CPAN has a module, Treemap, which implements several of the top variant algorithms in Perl and uses the labeling format of the Internet community map. The following Treemap (created with the module's example Dir_Squarified_Imager) portrays my /usr/lib/perl/ file tree graphically, and yields a few (now) fairly obvious surprises that would be hard to find with ls.

$ perl examples/Dir_Squarified_Imager.t.pl lib_perl.png /usr/lib/perl5/

Usage tree-map of /usr/lib/perl

We can easily see that:

This is great for finding duplicate and surprisingly big packages and (when run via sudo) the biggest disk abusers. But it can be used for so much more, and with Perl it's easy to hook it up to any data source you like.
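To make the idea concrete, here is a minimal sketch of the simplest treemap layout, slice-and-dice, in plain Perl. It does not use the Treemap module: the tree shape (hashes with name, size and children keys) and the function names total() and layout() are assumptions made for illustration only, and the module's Squarified layout is more elaborate than this.

```perl
use strict;
use warnings;

# Hypothetical tree: each node has a name and either a size (leaf)
# or a list of children (branch). Not the Treemap module's own format.
my $tree = {
    name     => 'root',
    children => [
        { name => 'a', size => 60 },
        { name => 'b', children => [
            { name => 'b1', size => 30 },
            { name => 'b2', size => 10 },
        ] },
    ],
};

# Total size of a node: its own size for a leaf, else the sum of its children.
sub total {
    my ($node) = @_;
    return $node->{size} unless $node->{children};
    my $sum = 0;
    $sum += total($_) for @{ $node->{children} };
    return $sum;
}

# Slice-and-dice: split the rectangle among the children in proportion to
# their totals, alternating the split direction at each level of the tree.
# Returns a list of [name, x, y, width, height], one entry per leaf.
sub layout {
    my ($node, $x, $y, $w, $h, $depth) = @_;
    return [ $node->{name}, $x, $y, $w, $h ] unless $node->{children};
    my $total = total($node) or return;    # nothing to draw for an empty branch
    my @rects;
    for my $child (@{ $node->{children} }) {
        my $share = total($child) / $total;
        if ($depth % 2 == 0) {             # even depth: split left to right
            my $cw = $w * $share;
            push @rects, layout($child, $x, $y, $cw, $h, $depth + 1);
            $x += $cw;
        }
        else {                             # odd depth: split top to bottom
            my $ch = $h * $share;
            push @rects, layout($child, $x, $y, $w, $ch, $depth + 1);
            $y += $ch;
        }
    }
    return @rects;
}

my @rects = layout($tree, 0, 0, 800, 600, 0);
printf "%-3s x=%.0f y=%.0f w=%.0f h=%.0f\n", @$_ for @rects;
```

Each leaf ends up with an area proportional to its size, which is the property that makes the big squares in the charts mean something. Any data source that can be turned into such a tree of names and sizes can be drawn this way.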

I happen to provide technical support for a .ORG website that produces some interesting numerical data every year, so I decided to try using Treemap. It wasn't quite as easy as it looks, but I had some fun and made some nifty graphics. Some difficulties I encountered include:

Using my modified canned example (below) we process the data like so:

$ for pn in ~/fd/FD_2006_*.xml
> do  fn=${pn##*/}
>  perl -I ../Treemap-0.2/  mod3.pl --padding=0 --spacing=1 \
>         --fontdir=/usr/share/fonts/TTF --font=Vera $pn ${fn%%.*}.png
>  echo $fn ${fn%%.*}.png done
>  done
 Loading /home/wdr/fd/FD_2006_ClassNatl.xml...
 FD_2006_ClassNatl.xml FD_2006_ClassNatl.png done

 Loading /home/wdr/fd/FD_2006_ClassSect.xml...
 FD_2006_ClassSect.xml FD_2006_ClassSect.png done

 Loading /home/wdr/fd/FD_2006_DivClass.xml...
 FD_2006_DivClass.xml FD_2006_DivClass.png done

 Loading /home/wdr/fd/FD_2006_Sections.xml...
 FD_2006_Sections.xml FD_2006_Sections.png done

 Loading /home/wdr/fd/FD_2006__EMA.xml...
 FD_2006__EMA.xml FD_2006__EMA.png done
Eastern MA Field Day Contacts Chart

About the Data

In case you're wondering, these are the scoring results from Amateur Radio Field Day, when ham radio stations around the country, many operating in the field, try to contact each other for fun, publicity, and points. The first image (inline) is the "zoom in" on just the Eastern Massachusetts Section (EMA). Since this one is small, its XML dataset FD_2006__EMA.xml is also included.

The later (linked only) images cover the full US & even some "DX" (foreign, or Distance=DX) stations. The squares for the EMA stations in the first chart are colored blue (cyan); and I've used Yellow for other New England Division sections. Class A are Field Stations in public places on emergency power, with B Battery, C Car, D Domestic, E Emergency Power (Home), F Fire/CD/Emergency-Op-Center. The number before the class letter indicates the number of active transmitters, so 2A means a club in a public place with 2 transmitters.
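The class notation is regular enough to split mechanically. The following is a small sketch and not part of mod3.pl: the function name parse_class is hypothetical, and the lookup table simply restates the class descriptions given above.

```perl
use strict;
use warnings;

# Class letters as described in the text above.
my %CLASS = (
    A => 'Field station, public place, emergency power',
    B => 'Battery',
    C => 'Car',
    D => 'Domestic',
    E => 'Emergency power (home)',
    F => 'Fire/CD/Emergency-Op-Center',
);

# Split an entry class such as "2A" into transmitter count and class letter.
# Returns an empty list if the string does not look like an entry class.
sub parse_class {
    my ($entry) = @_;
    my ($count, $letter) = $entry =~ /\A(\d+)([A-F])\z/ or return;
    return ($count, $letter, $CLASS{$letter});
}

for my $entry (qw(2A 1D 10F)) {
    my ($count, $letter, $desc) = parse_class($entry);
    print "$entry: $count transmitter(s), class $letter ($desc)\n";
}
```

A callsign such as W1NVT does not match the pattern, so parse_class returns nothing for it, which makes it easy to tell the two kinds of label apart when preparing the data.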

Oh, and the funny letters like W1NVT are the callsigns of the club stations.

I also ran a mid-range zoom, restricted to just the NE Division, in which I didn't bother to color the non-blue squares -- it consists only of the squares that are yellow or blue in the national charts. FD_2006_NED.png

The site for which I prepared these charts has a fuller explanation of the charts and of the whole Field Day, and links to the parent organization for amateur radio concepts.

mod3.pl Generic TreeMap Script, mk 1

SYNOPSIS

$0 [args] [infile] [outfile]

Options:

-help brief help
-man full doc
-verbose any trace capabilities in script, maybe none
-padding=5 interior space colored with parent node color; if 0, parent node colors vanish.
-spacing=5 exterior space colored with background color; if 0, squares split only by bkgcolor line.
-fontfile=../ImUgly.ttf A font to use, can be used instead of next two or default
-font=arial name of font to use, use with ...
-fontdir=/usr/share/fonts/corefonts  where to find fonts (with above)
-fontcolor='#RRGGBB' Font color in pseudo-html rgb numeric string

COPYRIGHT

Copyright Bill Ricker 2006, based on Treemap examples. This program is free software, to be used under the same terms as Perl itself.


#!/usr/bin/perl -l

use strict;
use warnings;

use lib( ".." );
use Treemap::Squarified;
use Treemap::Input::XML;
use Treemap::Output::Imager;

use Getopt::Long;
use Pod::Usage;
my ($verbose,$help, $man);

my ($padding, $spacing)=(5,5);
my ($fontdir, $font,$font_color)=(undef,'ImUgly','#000000'); ## 00=>Black
my $fontfile;
my $outfile;
my $outtype="png";
my $infile;
my ($width,$height)=(800,600); # Fullscreen on small projector good default?

my $result = GetOptions (
          "padding=i" => \$padding,    # numeric
          "spacing=i" => \$spacing,    # numeric
        # "length=i"  => \$length,     # numeric
        # "file=s"    => \$data,       # string
          "infile=s"  => \$infile,     # string
          "outfile=s" => \$outfile,    # string
          "outtype=s" => \$outtype,    # string (extension for default outfile)
          "fontdir=s" => \$fontdir,    # string
          "font=s"    => \$font,       # string
          "fontfile=s"=> \$fontfile,   # string
          "fontcolor=s"=>\$font_color, # string

          "verbose|v|V"=>\$verbose,    # flag
            "help|h|?" =>\$help,
            "man"      =>\$man,
        ) or pod2usage(2);
           pod2usage(1) if $help;
           pod2usage(-exitstatus => 0, -verbose => 2) if $man;

### TBD - could add a couple more options ...
# BORDER_COLOUR MIN_FONT_SIZE TEXT_DEBUG DEBUG
## Will also need to pass args to a CSV input parser ...
##


# Default cascades: option value, else next positional argument, else fallback
$infile ||= shift || "XML.xml";
print "infile=>$infile" if $verbose;

$outfile ||= shift || "$infile.$outtype";
print "outfile=>$outfile" if $verbose;

# With a font name but no file or directory, search the usual font directories
if ($font && ! $fontfile && ! $fontdir)
{
   my @fonts = grep { -f $_} map {"$_/$font.ttf"} (glob('~/.fonts'),glob('~/.fonts/*'),glob('/usr/share/fonts/*'));
   $fontfile = pop @fonts || (warn "No matching font in .font or /usr/share/fonts")&&'';
}

$fontfile ||= "$fontdir/$font.ttf" if $font && $fontdir;
$fontfile ||= '../ImUgly.ttf';
print "font=>$fontfile" if $verbose;

print "$width x $height ($padding, $spacing)" if $verbose;

## Input
#
my $input = Treemap::Input::XML->new; ## @TBD -- suggested autoloading Input type, output type?
print "Loading $infile...\n";
$input->load( $infile );

## Output
#
my $output = Treemap::Output::Imager->new( WIDTH=>$width, HEIGHT=>$height,
        FONT_FILE=>$fontfile,
        FONT_COLOUR=>$font_color,
        );

## Splice them together with Layout object
#
my $treemap = Treemap::Squarified->new( INPUT=>$input, OUTPUT=>$output,
        PADDING=>$padding,
        SPACING=>$spacing,
        );

## Do it
#
$treemap->map();
$output->save($outfile);

1. Shortly before going to press, I received word from the module author that another user had submitted a patch for DBI support. Until it propagates to CPAN you can nab the 0.04 treemap bundle with Treemap::Input::DBI from here