Perl Advent Calendar 2007-12-17

Subclassing all the way!

by David Westbrook

Calendar policy forbidding a module author to review his own module, has not breached here. Pod::Advent was created expressly for this review and to aid in the article authoring process.
— The Management

Advent Calendar entry submissions have specific formatting guidelines, which should be adhered to by contributing authors to ensure uninformity, otherwise the editors must undertake time-consuming revisions. Using Pod::Simple1, with considerable guidance from the accompanying documentation in Pod::Simple::Subclassing, a simple extended dialect of POD can be defined to aid authors.2

The first items we want to address and make easy are the use of <tt> for file, module, and program names, the use of perltidy -nnn -html for code samples, and the whole header section, including title, author, and css.

The easiest is the POD F<> tag, where we use the _handle_element_start() and _handle_element_end() methods to simply insert <tt> and </tt> tags, respectively, for the $element_name eq 'F' case (lines 43-44, 88-89). Implementing the I<> and B<> codes follows the exact same model (lines 45-46, 90-91).

Now, module names are a step more, because we want them to be hyperlinked to their CPAN search results. For this, we define our code custom POD code of M<> with the accept_codes() method (line 21). We have this behave the same as F<>, with the addition of a case in the _handle_text() method that changes the string from just the text to a HTML hyperlink (lines 125-126).

Any verbatim (indented) text in POD we want to treat as code, and have automatically follow the perltidy guideline. For this, we have a case in the _handle_text() method for $element_name eq 'Verbatim' that passes the text through perltidy with the -html -pre options (lines 99-107).

The tricky ones are setting the title and author fields. This is because they can't simply be written out in the order they are found in the POD source; e.g. the title appears twice -- once in the <title> tag and once in a <h1> tag, and both of those, and the author name, comprise the whole HTML header (lines 65-76) along with some static markup such as the <html>, <head>, <body>, and <style> tags.

So the approach is to use the accept_targets_as_text() method (line 22) to declare several =for advent_KEY VALUE tags to use for setting the various pieces of information. We note via a $section package variable whenever we come upon one of these in the _handle_element_start() method (lines 47-48). Then the next _handle_text() run for $element_name eq 'Para' will use the text to set $data{$section} instead of appending it to the output (lines 118-120). So this will populate a data hash as the source is processed.

What then ties everything together and does the actual output is the $element_name eq 'Document' case in the _handle_element_end() method (lines 58-78).

  use mod17; # a.k.a. Pod::Advent
  my $advent = mod17->new;
  $advent->parse_file( \*STDIN );

mod17.pm

   1 package mod17;
   2 
   3 use strict;
   4 use warnings;
   5 use base qw(Pod::Simple);
   6 use Perl::Tidy;
   7 
   8 our @mode;
   9 our $section = '';
  10 our %data = (
  11   title => undef,
  12   author => undef,
  13   year => (localtime)[5]+1900,
  14   day => 0,
  15   body => '',
  16 );
  17 
  18 sub new {
  19   my $self = shift;
  20   $self = $self->SUPER::new(@_);
  21   $self->accept_codes( qw/A M N/ );
  22   $self->accept_targets_as_text( qw/advent_title advent_author advent_year advent_day/ );
  23   $self->accept_targets( qw/eds/ );
  24   $self->accept_directive_as_data('sourcedcode');
  25   return $self;
  26 }
  27 
  28 sub add {
  29   my $self = shift;
  30   $data{body} .= $_[0];
  31 }
  32 
  33 sub br { shift->add("\n") }
  34 
  35 sub _handle_element_start {
  36   my($parser, $element_name, $attr_hash_r) = @_;
  37   push @mode, $element_name;
  38   if( $element_name eq 'Document' ){
  39   }elsif( $element_name =~ /^head([1-4])$/ ){
  40     $parser->add("<h$1>");
  41   }elsif( $element_name eq 'Para' ){
  42     $parser->add('<p>');
  43   }elsif( $element_name =~ /^(M|F|C)$/ ){
  44     $parser->add('<tt>');
  45   }elsif( $element_name =~ /^(I|B)$/ ){
  46     $parser->add("<$1>");
  47   }elsif( $element_name eq 'for' && $attr_hash_r->{target} =~ /^advent_(\w+)$/ ){
  48     $section = $1;
  49   }elsif( $element_name eq 'for' && $attr_hash_r->{target} eq 'eds' ){
  50     $mode[-1] = $attr_hash_r->{target};
  51     $parser->add('<blockquote style="padding: 1em; border: 2px ridge black; background-color:#eee">');
  52   }
  53 }
  54 
  55 sub _handle_element_end {
  56   my($parser, $element_name) = @_;
  57   my $mode = pop @mode;
  58   if( $element_name eq 'Document' ){
  59     $parser->br;
  60     $parser->add('</body>');
  61     $parser->br;
  62     $parser->add('</html>');
  63     $parser->br;
  64 
  65     printf <<'EOF', @data{qw/year title year day title author/};
  66 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
  67    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  68 <html xmlns="http://www.w3.org/1999/xhtml">
  69 <head>
  70 <title>%d Perl Advent Calendar: %s</title>
  71 <link rel="stylesheet" href="../style.css" type="text/css" /></head>
  72 <body>
  73 <h1><a href="../">Perl Advent Calendar %d-12</a>-%02d</h1>
  74 <h2 align="center">%s</h2>
  75 <h3 align="center">by %s</h3>
  76 EOF
  77     print $data{body};
  78 </html>
  79   }elsif( $element_name =~ /^head([1-4])$/ ){
  80     $parser->add("</h$1>");
  81     $parser->br;
  82   }elsif( $element_name eq 'Para' ){
  83     $parser->add('</p>');
  84     $parser->br;
  85   }elsif( $element_name eq 'for' && $mode eq 'eds' ){
  86     $parser->add('</blockquote>');
  87     $parser->br;
  88   }elsif( $element_name =~ /^(M|F|C)$/ ){
  89     $parser->add('</tt>');
  90   }elsif( $element_name =~ /^(I|B)$/ ){
  91     $parser->add("</$1>");
  92   }
  93 }
  94 
  95 sub _handle_text {
  96   my($parser, $text) = @_;
  97   my $mode = $mode[-1];
  98   my $out = '';
  99   if( $mode eq 'Verbatim' || $mode eq 'C' ){
 100     my $s;
 101     Perl::Tidy::perltidy(
 102         source            => \$text,
 103         destination       => \$s,
 104         argv              => [qw/-html -pre/],
 105     );
 106     $s =~ s#^<pre>\s*(.*?)\s*</pre>$#$1#si if $mode eq 'C';
 107     $out .= $s;
 108   }elsif( $mode eq 'sourcedcode' ){
 109     die "bad filename $text " unless -r $text;
 110     $out .= sprintf '<a name="%s"></a><h2><a href="%s">%s</a></h2>', ($text)x3;
 111     my $s;
 112     Perl::Tidy::perltidy(
 113         source            => $text,
 114         destination       => \$s,
 115         argv              => [qw/-html -pre -nnn/],
 116     );
 117     $out .= $s;
 118   }elsif( $mode eq 'Para' && $section ){
 119     $data{$section} = $text;
 120     $section = '';
 121   }elsif( $mode eq 'A' ){
 122     my ($href, $text) = split /\|/, $text, 2;
 123     $text = $href unless defined $text;
 124     $parser->add( sprintf('<tt><a href="%s">%s</a></tt>',$href,$text) );
 125   }elsif( $mode eq 'M' ){
 126     $parser->add( sprintf('<a href="http://search.cpan.org/search?query=%s">%s</a>',$text,$text) );
 127   }elsif( $mode eq 'N' ){
 128     $out .= sprintf '<sup><a href="#%s">%s</a></sup>', $text, $text;
 129   }else{
 130     $out .= $text;
 131   }
 132   $parser->add( $out, undef );
 133 }
 134 1;

1. As noted previously, we avoid covering modules in the Phalanx 100, but scratching an itch and a non-obvious usage is doubly magic. — The Management

2. Examples of submissions can be seen for 1/.pod, 8/.pod, 16/.pod, and today .pod.