Perl Advent Calendar 2012

So long until next year!

2012-12-25T00:00:00Z

The Advent calendar has ended. It ended yesterday, so behind this door you're not finding a rich, chocolatey article about Perl, but instead a bland note from the editor. Sorry!

As always, the Perl Advent Calendar was a group effort, and I'd like to thank everyone who contributed this year: Arthur Axel "fREW" Schmidt, Breno G. de Oliveira, Chris Prather, Dave Mitchell, David Golden, Jeffrey Ryan Thalhammer, Jesse Luehrs, Jonathan Swartz, Karen Etheridge, Mark Fowler, Philippe Bruhat (BooK), Rafaël Garcia-Suarez, Sawyer X, Shawn M Moore, Toby Inkster, and Yanick Champoux.

If you'd like to contribute an article for next year, you've got just about 365 days to do it! This year — for the first time in years, if not ever — we had more submissions than we could publish, so we might already have fewer than 24 slots open for 2013. Wow!

If you want to help with the site or other things in the meantime, you can check the perladvent/Perl-Advent GitHub repo, where we'll be talking about work that needs to get done on things like the FAQ, the site generator, and all that sort of thing. You can find the site's contents on GitHub, which should contain the 2012 articles by the time you see this. It's not exactly how it should be, but it's there.

More importantly, you can find our wish list for fixes and features. Help with these and you will earn fame forever (at least in the git logs)!

So until I'm back at Christmas 2013 – if I don't come to my senses and quit this job – have a merry Christmas and an excellent new year!

Have REST-ful Holidays

2012-12-24T00:00:00Z

Your boss comes to you the day after Thanksgiving vacation (or if you're in Scotland, the day after St. Andrew's Bank Holiday; if you're not in Scotland or the US, adjust accordingly):

    Boss: We need a Web API for the Flibber data. I want it to be REST-ful.

    You:  REST-ful API? Where did you hear about REST?

    Boss: I've started reading /r/programming. Everyone is making REST APIs
          now. We need one.

    You: *sigh* I'll get right on that.

So you need to build a REST-ful Web API in time for the Holidays. You may not even know what REST is, beyond some buzzword your boss picked up in an internet back-alley.

How to Explain REST to Anyone … even Ryan Tomayko's wife.

On his blog, Ryan Tomayko has a dialog with his wife in which he explains REST and why it's important. We don't have time to go into all of the details, so you should read it, but I'll try to cover the most important bits.

    Ryan: [...] The web is built on an architectural style called REST. REST
          provides a definition of a resource, which is what those things point
          to.

    Wife: A web page is a resource?

    Ryan: Kind of. A web page is a representation of a resource. Resources are
          just concepts. URLs--those things that you type into the browser...

    Wife: I know what a URL is..

    Ryan: Oh, right. Those tell the browser that there's a concept somewhere. A
          browser can then go ask for a specific representation of the concept.
          Specifically, the browser asks for the web page representation of the
          concept.

Basically, the World Wide Web works by clients requesting Representations of Resources identified by URLs (or URIs if you're pedantic). Clients and servers use HTTP to send these requests and return the responses. Most requests come from browsers that just want an HTML representation, but more and more clients are requesting non-HTML representations too. Thankfully the Web was designed to handle this, if we just write things in the right style.

REST is the style of writing applications so that they take full advantage of HTTP and the design of the Web. Now you know what REST is. Knowing is half the battle.

HTTP is Hard

This is a diagram of the state machine based on the HTTP protocol. It has 57 states asking 50 different questions about how to process any given HTTP request and generate the right response. That's a lot to keep in your head.

Luckily there are frameworks on CPAN to help with this. A good one for demonstration is Web::Machine by Stevan Little. It is based on the Erlang Webmachine project by Basho (makers of Riak!), which produced the state machine diagram.

Web::Machine is broken into two parts: a finite state machine that implements the diagram, and a Resource base class that provides sensible defaults that you can override in your own class. Let's just dive in.

A note: while Web::Machine itself works on Perl 5.10.1 or higher, all examples here explicitly use 5.16.2. If you change the version line, remember to enable strict yourself, since use 5.16.2 turns it on implicitly.

It's a Time Machine!

So let's start with a basic web service. My car doesn't have a clock in it, so to be properly Web 2.0 compliant, I'll write a JSON service that I can later target with an iOS client that will run from my phone. That won't be overkill at all.

use 5.16.2;
use Web::Machine;

{
    package WasteOfTime::Resource;
    use strict;
    use warnings;

    use parent 'Web::Machine::Resource';

    use JSON::XS qw(encode_json);

    sub content_types_provided { [{ 'application/json' => 'to_json' }] }

    sub to_json { encode_json({ time => scalar localtime }) }
}

Web::Machine->new( resource => 'WasteOfTime::Resource' )->to_app;

Web::Machine is a toolkit for building Resources. So after the standard boilerplate we start out by defining a resource class. Although Web::Machine was written by the same guy who brought you Moose, it actually tries to be minimal about its dependencies and doesn't sneak Moose in under the covers.

So we create a class WasteOfTime::Resource that will be our Resource class, and we have it inherit from Web::Machine::Resource so that Web::Machine will know it's a Resource and so that the proper defaults are set. We could be done here, and our application would do nothing but throw a 406 NOT ACCEPTABLE. But that's less than useful.

We know we want to provide a JSON API so we override the parent content_types_provided and say we will provide a representation of 'application/json' and that we should use the to_json method to get it.

Then we define the to_json representation. This resource doesn't have any state so we can just build the JSON inline. We use the scalar value of localtime because we want the nice string format not a list of numbers.
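
If the difference between the two contexts is new to you, here is a small standalone sketch (core Perl only, nothing to do with Web::Machine) that shows what localtime hands back in each one. The variable names are just for illustration.

```perl
use strict;
use warnings;

# In list context, localtime returns nine numeric fields
# (sec, min, hour, mday, mon, year, wday, yday, isdst).
my @fields = localtime;

# In scalar context, it returns a ctime(3)-style string such as
# "Sat Dec  8 21:04:02 2012".
my $stamp = scalar localtime;

printf "list context gave %d fields\n", scalar @fields;
print  "scalar context gave: $stamp\n";
```

Without the scalar, encode_json would receive the nine fields as extra hash entries rather than one readable string.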

Finally once our resource class is built, we create a Web::Machine instance, tell it which resource class to use and then have it provide us a Plack application. If we save all of this in a file (I chose time.psgi) we can run it.

    $ plackup time.psgi
    HTTP::Server::PSGI: Accepting connections at http://0:5000/

Which we can now access using a web client.

    $ curl -v http://0:5000

    * About to connect() to 0 port 5000 (#0)
    *   Trying 127.0.0.1... connected
    * Connected to 0 (127.0.0.1) port 5000 (#0)
    > GET / HTTP/1.1
    > User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
    > Host: 0:5000
    > Accept: */*
    >
    * HTTP 1.0, assume close after body
    < HTTP/1.0 200 OK
    < Date: Sun, 09 Dec 2012 02:04:02 GMT
    < Server: HTTP::Server::PSGI
    < Content-Length: 35
    < Content-Type: application/json
    <
    * Closing connection #0
    {"time":"Sat Dec  8 21:04:02 2012"}

And you can see our Representation there at the end. If we try a request that isn't allowed, say for an HTML representation, we will get the appropriate error too.

    $ curl -v http://0:5000 -H'Accept: text/html'

    * About to connect() to 0 port 5000 (#0)
    *   Trying 127.0.0.1... connected
    * Connected to 0 (127.0.0.1) port 5000 (#0)
    > GET / HTTP/1.1
    > User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
    > Host: 0:5000
    > Accept: text/html
    >
    * HTTP 1.0, assume close after body
    < HTTP/1.0 406 Not Acceptable
    < Date: Sun, 09 Dec 2012 02:07:47 GMT
    < Server: HTTP::Server::PSGI
    < Content-Length: 14
    <
    * Closing connection #0
    Not Acceptable

We get that 406 not acceptable again.
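
To make the decision behind that 406 a little more concrete, here is a rough sketch of the matching step in plain Perl. It is not Web::Machine's code: choose_media_type is a hypothetical helper, and it deliberately ignores quality values, which real negotiation also weighs.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Web::Machine: choose the first provided
# media type that the client's Accept header allows. Parameters such as
# quality values (";q=0.5") are stripped and ignored to keep this short.
sub choose_media_type {
    my ($accept, @provided) = @_;

    my @wanted = map { s/\s*;.*//r } split /\s*,\s*/, $accept // '*/*';

    for my $range (@wanted) {
        for my $type (@provided) {
            my ($major) = split m{/}, $type;
            return $type
                if $range eq '*/*' || $range eq $type || $range eq "$major/*";
        }
    }
    return;    # nothing acceptable: the server should answer 406
}

for my $accept ('*/*', 'text/html', 'application/*') {
    my $type = choose_media_type($accept, 'application/json');
    printf "%-15s => %s\n", $accept, $type // '406 Not Acceptable';
}
```

With only application/json on offer, the text/html request is the one that falls through to 406, which matches what curl showed us.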

Many Ways to Say the Same Thing

So far we're not doing bad for 20 lines of code, but what if we want that HTML representation too? Actually it's pretty simple. First we add a new content type.

    sub content_types_provided { [
        { 'application/json' => 'to_json' },
        { 'text/html'        => 'to_html' },
    ] }

We say that 'text/html' will be handled by to_html. Now we just define a to_html method to return our HTML representation.

sub to_html {
    join "" =>
    '<html>',
        '<head>',
            '<title>The Time Now Is:</title>',
        '</head>',
        '<body>',
            '<h1>'.localtime.'</h1>',
        '</body>',
    '</html>'
}

Notice that Web::Machine doesn't have any opinion on how you generate HTML. You're free to use whatever template system you want. You're also free to write all of the glue code for that. Web::Machine is pretty bare bones about that, which is why it's called a toolkit and not a framework.

So if we add this code and we issue that last request we can see the change.

    $ curl -v http://0:5000 -H'Accept: text/html'

    * About to connect() to 0 port 5000 (#0)
    *   Trying 0.0.0.0... connected
    * Connected to 0 (0.0.0.0) port 5000 (#0)
    > GET / HTTP/1.1
    > User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
    > Host: 0:5000
    > Accept: text/html
    >
    * HTTP 1.0, assume close after body
    < HTTP/1.0 200 OK
    < Date: Sun, 09 Dec 2012 02:26:39 GMT
    < Server: HTTP::Server::PSGI
    < Vary: Accept
    < Content-Length: 103
    < Content-Type: text/html
    <
    * Closing connection #0
    <html><head><title>The Time Now Is:</title></head><body><h1>Sat Dec  8 21:26:39 2012</h1></body></html>

The Times They Are A Changing

So we're returning multiple representations, and that's great, but what if we want to alter the resource? Let's let ourselves change the timezone. We'll need to use POSIX qw(tzset) and add some methods.

use POSIX qw(tzset);

sub allowed_methods { [qw[ GET POST ]] }

sub process_post {
    my $self = shift;
    my $input = eval { JSON::XS->new->decode( $self->request->content ); };
    $ENV{TZ} = $input->{timezone};
    tzset;
    return 1;
}

Changing the allowed_methods lets Web::Machine know we are expecting POST requests as well as GET requests to this resource. Then when we process the post we simply set the appropriate value.

    $ curl -v -X POST http://0:5000 -H 'Content-Type: application/json' -d '{"timezone":"America/Los_Angeles"}'

    * About to connect() to 0 port 5000 (#0)
    *   Trying 127.0.0.1... connected
    * Connected to 0 (127.0.0.1) port 5000 (#0)
    > POST / HTTP/1.1
    > User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
    > Host: 0:5000
    > Accept: */*
    > Content-Type: application/json
    > Content-Length: 34
    >
    * HTTP 1.0, assume close after body
    < HTTP/1.0 204 No Content
    < Date: Sun, 09 Dec 2012 02:49:22 GMT
    < Server: HTTP::Server::PSGI
    < Vary: Accept
    < Content-Type: application/json
    <
    * Closing connection #0

If we check now, we'll see that the time has changed.

    $ curl -v http://0:5000

    * About to connect() to 0 port 5000 (#0)
    *   Trying 127.0.0.1... connected
    * Connected to 0 (127.0.0.1) port 5000 (#0)
    > GET / HTTP/1.1
    > User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
    > Host: 0:5000
    > Accept: */*
    >
    * HTTP 1.0, assume close after body
    < HTTP/1.0 200 OK
    < Date: Sun, 09 Dec 2012 02:46:56 GMT
    < Server: HTTP::Server::PSGI
    < Vary: Accept
    < Content-Length: 35
    < Content-Type: application/json
    <
    * Closing connection #0
    {"time":"Sun Dec  9 02:46:56 2012"}

Since the previous times were America/New_York the new times are the correct 3 hours behind.

[Something Witty HERE]

In addition to supporting the standard HTTP methods, Web::Machine helps with much of the rest of the HTTP standard, including things like cache control headers. To enable the most basic cache controls, simply provide a couple of methods to generate the ETag and Last-Modified headers.

use Digest::SHA qw(sha1_hex);
use Web::Machine::Util qw(create_date);

sub generate_etag { sha1_hex(scalar localtime) }

sub last_modified { create_date(scalar localtime) }

We import two new modules here. Digest::SHA helps us just make a unique identifier for our resource. Web::Machine::Util helps us create the appropriate date object that Web::Machine is expecting.

If we run our client against this now we'll see the new cache control headers.

    $ curl -v http://0:5000

    * About to connect() to 0 port 5000 (#0)
    *   Trying 0.0.0.0... connected
    * Connected to 0 (0.0.0.0) port 5000 (#0)
    > GET / HTTP/1.1
    > User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
    > Host: 0:5000
    > Accept: */*
    >
    * HTTP 1.0, assume close after body
    < HTTP/1.0 200 OK
    < Date: Sun, 09 Dec 2012 14:50:21 GMT
    < Server: HTTP::Server::PSGI
    < ETag: "fa4c7582066e3b42fffd346cfba9714ea66cd645"
    < Vary: Accept
    < Content-Length: 35
    < Content-Type: application/json
    < Last-Modified: Sun, 09 Dec 2012 14:50:21 GMT
    <
    * Closing connection #0
    {"time":"Sun Dec  9 09:50:21 2012"}

And if we make a request for a resource that should be cached, we get the right response code.

    $ curl -v http://0:5000 -H'If-Modified-Since: Sun, 09, Dec 2012 14:55:21 GMT'

    * About to connect() to 0 port 5000 (#0)
    *   Trying 0.0.0.0... connected
    * Connected to 0 (0.0.0.0) port 5000 (#0)
    > GET / HTTP/1.1
    > User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
    > Host: 0:5000
    > Accept: */*
    > If-Modified-Since: Sun, 09, Dec 2012 14:55:21 GMT
    >
    * HTTP 1.0, assume close after body
    < HTTP/1.0 304 Not Modified
    < Date: Sun, 09 Dec 2012 14:55:11 GMT
    < Server: HTTP::Server::PSGI
    < ETag: "f6da728260ea1563bd14ce999f0246a4817f6fee"
    < Vary: Accept
    < Last-Modified: Sun, 09 Dec 2012 14:55:11 GMT
    <
    * Closing connection #0
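
One thing worth noticing in the two responses above is that the ETag differs between them: hashing scalar localtime gives a new tag every second. For a resource with real state you would more likely derive the tag from the representation itself. Here is a hedged sketch of that idea using the If-None-Match validator rather than If-Modified-Since; etag_for and status_for_get are hypothetical helpers, not Web::Machine methods, and Web::Machine does the header comparison for you.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical helper: derive a quoted ETag from the bytes of a representation,
# so the tag changes only when the representation does.
sub etag_for {
    my ($body) = @_;
    return '"' . sha1_hex($body) . '"';
}

# Hypothetical helper: pick the status for a GET, given the client's
# If-None-Match header (possibly undef) and the current representation.
sub status_for_get {
    my ($if_none_match, $body) = @_;
    return 304 if defined $if_none_match && $if_none_match eq etag_for($body);
    return 200;
}

my $body = '{"timezone":"America/New_York"}';
my $etag = etag_for($body);

print status_for_get(undef, $body), "\n";    # first visit, no validator yet
print status_for_get($etag, $body), "\n";    # client cache is still current
print status_for_get($etag, '{"timezone":"America/Los_Angeles"}'), "\n";
```

The three lines printed are 200, 304, and 200: only the middle request can reuse its cached copy.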

In addition to cache controls, Web::Machine provides methods for authentication, request validation, URI validation, charset and encoding variation, and most of the rest of the HTTP spec.

The Downsides

Web::Machine is pretty bare bones. It leaves a lot of opinions beyond HTTP up to the author. This is considered a bonus because those opinions are heavily influenced by the environment your application will be deployed in. If you want a framework that provides more pre-built wheels, you may want to look at Magpie, a framework based upon the same principles as Web::Machine that takes a very different approach to its implementation.

One of the principles of REST is that hypertext is the engine of application state. Because Web::Machine has no opinions on templating, or really representation generation at all, it has no tools for building Hypermedia Documents. I highly recommend looking at the Hypermedia Application Language (HAL) specification for structuring hypermedia documents. It describes serializations in both JSON and XML depending on how old school you want to go.
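
As a taste of what that looks like, here is a small sketch that builds a HAL-style JSON document for the time resource. The _links object with a self entry carrying an href is the HAL convention; the timezone link, its URL, and the field values are made up for illustration. I'm using JSON::PP only because it ships with Perl.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical resource state and URLs; only the "_links"/"self"/"href"
# layout comes from HAL itself.
my $document = {
    time     => scalar localtime,
    timezone => 'America/New_York',
    _links   => {
        self     => { href => '/' },
        timezone => { href => '/timezone' },
    },
};

# canonical sorts keys so the output is stable; pretty adds indentation.
print JSON::PP->new->canonical->pretty->encode($document);
```

A to_json method could return a document like this, served as application/hal+json, so that clients discover related URLs from the representation instead of hard-coding them.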

Currently Web::Machine also doesn't handle an asynchronous environment. To be honest, HTTP really doesn't have an asynchronous mode. The closest HTTP has is multi-part responses, which are uni-directional streams. An example of this is the Twitter streaming API. There has been talk about adding support for this to Web::Machine, but if you're looking for this, or something like WebSockets, right now, Web::Machine isn't the right choice.

Give and Receive the Right Number of Gifts

2012-12-23T00:00:00Z

Making a List

Santa is already hard at work preparing for this year's Christmas. Little boys and girls across the world are sending him their Christmas lists, which look like this:

  Shawn.txt:
      3 wooden toys
      1 dog clock
      1 hobby horse
      2 glider planes

Santa naturally uses Perl to parse this list, as he has been these last 25 Christmases. Santa's Perl script produces the work orders that his elves use to make all those toys.

#!/usr/bin/env perl
use 5.16.0;
use warnings;
use autodie;

my $kid_name = shift;
my $toy_count = 0;

open my $handle, '<', "$kid_name.txt";
while (<$handle>) {
    my ($quantity, $gift) = /^(\d+) (.+)/;
    say "$kid_name would like $gift ($quantity).";
    $toy_count += $quantity;
}

say "Dearest Elf, please make $toy_count gifts for $kid_name.";

Santa's script simply looks at each line in the child's Christmas list and keeps a tally of how many gifts in total they would like. Here's what Santa sends to his elves on my behalf:

  Shawn would like wooden toys (3).
  Shawn would like dog clock (1).
  Shawn would like hobby horse (1).
  Shawn would like glider planes (2).
  Dearest Elf, please make 7 gifts for Shawn.

All is well and good in Santa's Workshop!

JavaScript

In his copious free time, Santa has been learning this year's hot new thing, node.js. After all, Santa has his future to think about. If the giving-billions-of-presents-to-kids-every-year racket doesn't work out, he knows that he can fall back on his hobby, programming. But Santa knows software engineering jobs generally demand fluency in multiple languages, so to get some practical experience, he'll use node.js to process these Christmas lists. Since Perl is so good at text parsing, Santa will continue using that to process the incoming Christmas lists. But he wants to port the work order builder to JavaScript. And of course, Santa knows to use JSON to share data between the two environments.

First, he changes the Perl script to emit JSON instead of text:

#!/usr/bin/env perl
use 5.16.0;
use warnings;
use autodie;
use JSON;

my $kid_name = shift;
my @xmas_list;

open my $handle, '<', "$kid_name.txt";
while (<$handle>) {
    my ($quantity, $gift) = /^(\d+) (.+)/;
    push @xmas_list, {
        gift     => $gift,
        quantity => $quantity,
    };
}

say to_json({
    kid_name  => $kid_name,
    xmas_list => \@xmas_list,
});

Now that the Perl script is producing JSON, Santa is ready to write the node code to consume the Christmas list. He structured the JavaScript to register callbacks for interesting events: every time there's new data on STDIN, it is concatenated onto the json variable. Then when STDIN is closed, the full JSON is consumed and examined to print out the work order.

#!/usr/bin/env node
var json = '';

process.stdin.resume();

process.stdin.on('data', function(chunk) { json += chunk });

process.stdin.on('end', function() {
    var input     = JSON.parse(json),
        kid_name  = input.kid_name,
        xmas_list = input.xmas_list,
        toy_count = 0;

    xmas_list.forEach(function (item) {
        console.log(kid_name + " would like " + item.gift + " (" + item.quantity + ").");
        toy_count += item.quantity;
    });

    console.log("Dearest Elf, please make " + toy_count + " gifts for " + kid_name + ".");
});

Santa fires off these two scripts and examines the first work order that comes out.

    Shawn would like wooden toys (3).
    Shawn would like dog clock (1).
    Shawn would like hobby horse (1).
    Shawn would like glider planes (2).
    Dearest Elf, please make 03112 gifts for Shawn.

Oh no! 3,112 gifts is way too many for one kid! Santa scratches his beard and tries to figure out where things went wrong. He reads the JavaScript again and again, but it all looks right. He reads the Perl again and again, but that looks right too. And where did that zero come from anyway? Elves are certainly eccentric but even they prefer decimal, not octal, numbers.

Do you, beloved reader, see the problem?

Data Interchange

This problem is driving Santa batty, so he starts shifting a little bit more blame than is deserved.

"Does node.js not even get addition right?"

"Is this abominable computer playing tricks on me?"

"That Larry fella? Coal!!"

Certainly you've felt the same kind of debugging paranoia before, too. Often a divide and conquer approach will get your foot in the door of these kinds of problems. If Santa can decisively conclude that the bug is in either the Perl or the JavaScript, that halves the problem he is flailing at.

We can figure out which language the bug is in by closely examining the JSON passed from Perl to JavaScript. If the JSON is correct, then the bug can't possibly be in Perl. If the JSON is wrong, then the bug is obviously in Perl. So what does that JSON look like?

{
    "xmas_list" : [
        {
            "gift"     : "wooden toys",
            "quantity" : "3"
        },
        {
            "gift"     : "dog clock",
            "quantity" : "1"
        },
        {
            "gift"     : "hobby horse",
            "quantity" : "1"
        },
        {
            "gift"     : "glider planes",
            "quantity" : "2"
        }
    ],
    "kid_name" : "Shawn"
}

That JSON is wrong! Why is each quantity a string? That makes no sense. And since the JSON is wrong, there must be a bug in the Perl program.

Each quantity is a string ultimately because Perl is very forgiving. If you write $str1 + $str2, Perl knows that you meant addition and not concatenation because you chose the + operator instead of the . operator. Concatenation doesn't even cross Perl's mind! So Perl dutifully complies with the + operator by numifying $str1 and $str2 then adding whatever it pulled out of those two strings.

In Santa's original completely-Perl program, and indeed in most programs, this behavior is very helpful. When Perl deconstructed each line of the Christmas list with a regular expression, it pulled out two substrings: quantity and gift. But when Santa summed up the number of gifts, Perl numified each quantity. We didn't need to tell Perl to do that beyond just using addition. This correctly produced the total 7.

However when Santa changed his program to start emitting JSON, there was no more addition to hint to Perl that quantity is actually numeric. Instead, when JSON came to serialize each quantity, it saw that Perl currently thought the value was a string not a number (since it was produced with a regular expression capture group). So JSON produced the strings "3", "1", "1", and "2".

This is problematic because in many other languages that aren't Perl, such as JavaScript, what the + operator will do depends on the types of its operands. number + number means addition and string + string means concatenation. It's not better or worse than how Perl does it; just different. But it means that when the JSON contained strings for quantity, node.js chose concatenation, not addition, during each iteration of toy_count += item.quantity;. This is how Santa accidentally ordered "03112" gifts for me (recall that toy_count was initialized to 0, so that's where that leading 0 came from).
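
You can reproduce both results without leaving Perl, since Perl simply makes you pick the operator. This little sketch (core Perl only, variable names mine) feeds the same four strings through + and through the concatenation operator:

```perl
use strict;
use warnings;

my @quantities = ('3', '1', '1', '2');    # strings, as a regex capture yields

# Perl's + always adds, numifying its operands first.
my $sum = 0;
$sum += $_ for @quantities;

# Perl's . always concatenates, which is what JavaScript's + does when
# either operand is a string.
my $joined = 0;
$joined .= $_ for @quantities;

print "addition:      $sum\n";       # 7
print "concatenation: $joined\n";    # 03112
```

The second line of output is exactly the order Santa almost sent to the elves.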

Numification

Now we understand the bug. But what is Santa to do about it? It's already December 23rd, he doesn't have much time here!

The fix is to force Perl to treat the quantity value as numeric. You can do this in several different ways: adding zero, multiplying by one, using int(...), and so on. These operations can only produce numbers, which lets Perl annotate such values as being numeric. That way when JSON comes to serialize quantity, it sees that the value is a number not a string, so it leaves off the quotation marks. Then when node.js parses this JSON, it treats each quantity as a number, not as a string, so + means addition not concatenation, so we should end up with the correct sum.
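
Here is a quick way to see the difference for yourself. This sketch uses JSON::PP, which ships with Perl, as a stand-in for the JSON module in Santa's script; the sample line and variable names are only for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

my ($quantity) = '3 wooden toys' =~ /^(\d+) /;    # a capture is a string

my $json = JSON::PP->new->canonical;

print $json->encode({ quantity => $quantity }),       "\n";    # {"quantity":"3"}
print $json->encode({ quantity => $quantity + 0 }),   "\n";    # {"quantity":3}
print $json->encode({ quantity => $quantity * 1 }),   "\n";    # {"quantity":3}
print $json->encode({ quantity => int($quantity) }),  "\n";    # {"quantity":3}
```

Each of the last three expressions produces a brand new numeric value, which is why the quotation marks disappear.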

Let's add zero to $quantity to produce a number.

push @xmas_list, {
    gift     => $gift,
    quantity => $quantity + 0,
};

With this change, the JSON looks like this:

{
    "xmas_list" : [
        {
            "gift"     : "wooden toys",
            "quantity" : 3
        },
        {
            "gift"     : "dog clock",
            "quantity" : 1
        },
        {
            "gift"     : "hobby horse",
            "quantity" : 1
        },
        {
            "gift"     : "glider planes",
            "quantity" : 2
        }
    ],
    "kid_name" : "Shawn"
}

And the toy order looks like this:

  Shawn would like wooden toys (3).
  Shawn would like dog clock (1).
  Shawn would like hobby horse (1).
  Shawn would like glider planes (2).
  Dearest Elf, please make 7 gifts for Shawn.

Great! Problem solved!

JSON::Types

Santa is very happy that orders are now flowing correctly. But he is concerned about that + 0. It's not particularly obvious what the + 0 is doing, because, practically speaking, when would you ever want to add zero to something? Next year when it comes time to incorporate 2013's cool tech, Santa may have forgotten the reason for the + 0 and outright delete it, or simply drop it when he next refactors the code. He could leave a comment:

push @xmas_list, {
    gift     => $gift,
    quantity => $quantity + 0, # treat quantity as a number
};

But he's understandably still concerned about potential maintenance problems because, let's face it, + 0 is fundamentally strange.

Luckily for Santa, there is a new module called JSON::Types that makes fixing this and other similar problems a treat. JSON::Types provides a subroutine called number which encapsulates the messy bit of convincing Perl to produce a number.

push @xmas_list, {
    gift     => $gift,
    quantity => JSON::Types::number($quantity),
};

The best part is that just by putting JSON::Types::number into the source code, it's a lot more obvious what is really going on. Even a thousand years from now, Santa will be able to understand what that piece of code means and why it must happen. After all, JSON::Types has easily-findable documentation explaining the problem. It's hard to document + 0 in general, and it's hard to search for too. Just seeing the word "JSON" in that line of code might even be enough to jog Santa's memory.

JSON::Types also provides string for turning numbers into strings. You could of course use "$number" to force a number into a string, but that has the same kinds of problems as + 0.

Finally, JSON::Types also provides a bool subroutine for producing the constants true and false that JSON has. In Perl, we get by just fine with using undef or the empty string for false and 1 for true, but in other languages, that simply will not stand. bool($value) will use the same kind of logic as Perl's if to decide whether $value should produce true or false. Without JSON::Types, you'd have to do something silly like $value ? JSON::true : JSON::false.
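
If you're curious what helpers like these boil down to, here is a rough sketch written against JSON::PP so that it runs on a stock Perl. The as_number, as_string, and as_bool subs are hypothetical stand-ins that mimic the behaviour described above; they are not JSON::Types' actual source.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical stand-ins that mimic the behaviour the article describes for
# JSON::Types; they are not that module's actual source.
sub as_number { return $_[0] + 0 }
sub as_string { return "$_[0]" }
sub as_bool   { return $_[0] ? JSON::PP::true() : JSON::PP::false() }

my $json = JSON::PP->new->canonical;

print $json->encode({
    quantity => as_number('3'),
    label    => as_string(42),
    wrapped  => as_bool('yes'),
    returned => as_bool(''),
}), "\n";
```

With canonical key ordering this prints {"label":"42","quantity":3,"returned":false,"wrapped":true}, so each value arrives on the JavaScript side with the type you intended.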

So, this year, be sure you're using addition, not concatenation, to count how many gifts your loved ones get!

Generate static web sites using your favorite Perl framework

2012-12-22T00:00:00Z

The best of two worlds

Static web sites

Have you noticed the recent trend of static blogging?

The idea behind static blogging is to use a tool to generate the HTML pages that constitute the blog from a set of simple text files, and to publish these generated pages using a basic web server.

Some would argue that a blog without comments is not really a blog. And how do you comment without POST? One way would be to delegate the POSTing to someone else (like Disqus).

Also, static web sites don't have to be frozen. Nothing prevents you from generating the site content regularly (especially if it depends on an external source) or from hooking it to your VCS repository, so that every update to the source triggers a regeneration.

Going back to static web sites, here are a few reasons why people like them:

Speed:

It would be really hard to beat a webserver serving a static file from disk.

Security:

No user input means no SQL injection. If no code is run to produce the response, then no bugs can interfere in the process.

Of course, you're still vulnerable to your webserver's own security issues when it serves static files, but that should be a pretty limited set.

Simplicity:

A static web site is a bunch of files. You can commit them in a VCS and push them to their final destination, or you can use FTP. It's the easiest deployment procedure ever.

Economy:

Generate once, request any time!

Static blogging tools like Jekyll, Pelican, and Middleman keep popping up, and new ones are invented almost daily.

(I have myself been using ttree for years, but writing code using Template Toolkit's DSL can be limiting.)

The setup is always the same: take a bunch of files in some format (usually Markdown, reStructuredText or Textile), plus some configuration, and run the tool. The problem with that is always the same: the model fits the original author's needs, and you have to follow their rules. Personalisation not included.

Web frameworks

Perl has plenty of awesome web frameworks, such as Catalyst, Dancer, Mojolicious, and many others, to let you write your web application the way you want. Each has its own set of advantages and disadvantages, but that is not the point of this article.

The point is that using those to run a blog or the framework's marketing^Whome page may seem wasteful, as there's probably little need to regenerate a page for each request, no matter if the content has changed or not.

Static web sites made with web frameworks

PSGI is an interface between web servers and applications written in Perl. The Plack implementation of PSGI is supported by most Perl web frameworks. It's also possible to write your own application (a PSGI application is just a subroutine) and connect it to any supported web server — and most web servers are supported.

After having tried to write my own static site generator, and having failed at making it as flexible as I would have liked (which in retrospect would probably have made it a web framework in itself), it seemed wiser to start building a site with one of those nice web frameworks and to use Plack as my entry point to get to the content.

wallflower is a command-line tool that takes a PSGI application, uses Plack to access the content, and saves it to local files, ready to be uploaded to your static web server.

After obtaining the coderef for your application, it repeatedly creates the PSGI environment for the URL you want to process and runs your app on it (using Plack::Util::run_app), saving the response content to a local file. If the response content type is text/html or text/css, it will automatically look for embedded links and add them to its queue, thus enabling auto-discovery of the entire web site.
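The loop described above can be sketched with core Perl alone. This is not wallflower's implementation: the crawl function, the %pages content and the regex-based link extraction are all assumptions made for illustration, and the sketch only handles responses whose body is an array reference.

```perl
use strict;
use warnings;
use File::Basename qw( dirname );
use File::Path qw( make_path );
use File::Temp qw( tempdir );

# A hand-written PSGI application is just a subroutine (pages are made up)
my %pages = (
    '/index.html' => '<a href="/about.html">About</a>',
    '/about.html' => '<a href="/index.html">Home</a>',
);
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} eq '/' ? '/index.html' : $env->{PATH_INFO};
    return [ 404, [ 'Content-Type' => 'text/plain' ], ['Not found'] ]
        if !exists $pages{$path};
    return [ 200, [ 'Content-Type' => 'text/html' ], [ $pages{$path} ] ];
};

# Call the app once per URL, save each body, and queue any links found
sub crawl {
    my ( $app, $dir ) = @_;
    my @queue = ('/');
    my %seen;
    while ( defined( my $url = shift @queue ) ) {
        next if $seen{$url}++;

        # a deliberately minimal PSGI environment
        my $env = {
            REQUEST_METHOD => 'GET',
            SCRIPT_NAME    => '',
            PATH_INFO      => $url,
            QUERY_STRING   => '',
            SERVER_NAME    => 'localhost',
            SERVER_PORT    => 80,
        };
        my ( $status, $headers, $body ) = @{ $app->($env) };
        next if $status != 200;

        my $content = join '', @$body;    # assumes an array-ref body
        my $file = $dir . ( $url =~ m{/$} ? "${url}index.html" : $url );
        make_path( dirname($file) );
        open my $fh, '>', $file or die "Can't write $file: $!";
        print {$fh} $content;
        close $fh or die "Can't close $file: $!";

        # follow links only in HTML responses
        my %header = @$headers;
        push @queue, $content =~ /href="([^"]+)"/g
            if ( $header{'Content-Type'} // '' ) =~ m{^text/html};
    }
    my @visited = sort keys %seen;
    return @visited;
}

my @urls = crawl( $app, tempdir( CLEANUP => 1 ) );
print "$_\n" for @urls;
```

Starting from / alone, the sketch reaches every page through the links it finds, which is the auto-discovery behaviour the real tool provides far more robustly.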

The point of Wallflower is to let you write any static website using all the power of your favorite web framework. It also follows links inside your Plack application, so if your site is properly organized, you only need to point it to /.

Blogging statically with your favorite framework

The obvious example for this would be to write a blog. I'll use Dancer, because it's the only web framework I know, but keep in mind that this will work with any PSGI-compliant framework. You could actually write your own PSGI application, if no existing framework suited you.

Since our target is a static web site, the main thing to keep in mind is that the target web server will determine the content type by looking at the extension, so all of our URLs must have an extension.

The sources for our basic blog will be a set of text files in the public/ subdirectory, with the content written in Markdown. URLs will simply be mapped to those files.

So, we start by writing a route to handle all URLs ending with .html:

package ShyBlog;
use Dancer ':syntax';
use Text::Markdown;
use Path::Class qw( file );

my $m = Text::Markdown->new;

get qr{/(.*)\.html} => sub {
    my ($file) = splat;
    my $text = file( setting('public') => "$file.txt" )->slurp;
    template 'blog', { content => $m->markdown($text) };
};

1;

Since we put our blog entries in the public/ directory, Dancer will automatically serve the source when we end the URL in .txt! And we didn't even need to write a route for that!

Now, if we want to get any further than a single blog post (showing a main page with the latest post, side bars on every page pointing to the archives by month, and maybe a JSON file with all our tags for making a nice tag cloud in JavaScript), we have a bit of a problem: we need to know about all our blog's posts when generating any individual one.

Remember that our PSGI application is ultimately a subroutine that will be called repeatedly by wallflower, so we just have to make the needed data available to the subroutine by building the list of all posts, once and for all, during the initialisation phase of the application.

A simple call to File::Find will help us generate the list of all posts, from which we can create a data structure. In this example it's an array:

use File::Find;
use Path::Class;

my @entries;

find(
    sub {

        # we only care about blog entries
        return if !/\.txt$/;

        # get a Path::Class::File for it
        my $file = file($File::Find::name);
        my $fh   = $file->openr;

        # parse a simple header using the kite secret operator
        chomp( my ( $title, $date, $tags ) = ( ~~<$fh>, ~~<$fh>, ~~<$fh> ) );

        # update the structure with all relevant information
        my $source = substr( $File::Find::name, length( setting('public') ) );
        ( my $url = $source ) =~ s/\.txt$/.html/;

        push @entries, {
            url    => $url,
            title  => $title,
            date   => $date,
            tags   => [ split /\s*,\s*/, $tags ],
            source => $source,
        };
    },
    setting( 'public' )
);

Actually, for simplicity, and integration with the framework, it would make sense to create a temporary SQLite database, with a few tables for blog entries meta-information, tags, etc. The code in the templates and some special routes (like the main page) can then use that database to fetch all the information they need.
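Even without a database, the derived views mentioned earlier (latest post, monthly archives, tag cloud) are plain data munging once the list of entries exists. Here is a rough sketch; the sample entries and the YYYY-MM-DD date format are assumptions, since the header format of the posts is not specified.

```perl
use strict;
use warnings;

# Hypothetical entries, shaped like the ones collected by File::Find
my @entries = (
    { url => '/2012/11/hello.html',  title => 'Hello',  date => '2012-11-30', tags => ['perl'] },
    { url => '/2012/12/psgi.html',   title => 'PSGI',   date => '2012-12-02', tags => [ 'perl', 'psgi' ] },
    { url => '/2012/12/static.html', title => 'Static', date => '2012-12-20', tags => ['psgi'] },
);

# newest post first, for the main page
my @latest = sort { $b->{date} cmp $a->{date} } @entries;

# archives: "YYYY-MM" => [ entries ]
my %archive;
push @{ $archive{ substr( $_->{date}, 0, 7 ) } }, $_ for @entries;

# tag cloud: tag => number of posts
my %tag_count;
$tag_count{$_}++ for map { @{ $_->{tags} } } @entries;

print "Latest: $latest[0]{title}\n";
print "$_: ", scalar @{ $archive{$_} }, " post(s)\n" for sort keys %archive;
```

Because these structures are built once at start-up, every route and template can read them without rescanning the public/ directory.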

Generating the website is now simply a matter of running:

    $ wallflower -a bin/app.pl -d /path/to/the/output/

wallflower will start browsing the application from / and will follow all links (from HTML and CSS files) to generate your site content.

You can then copy the content of output/ to the proper location on the target web server, and you're done!

Set-based DBIx::Class

2012-12-21T00:00:00Z

I've been using DBIx::Class for a few years, and I've been part of the development team for just a little bit less. Three years ago I wrote a Catalyst Advent article about the five DBIx::Class::Helpers, which have since ballooned to twenty-four. I'll be mentioning a few helpers in this post, but the main thing I want to describe is a way of using DBIx::Class that results in efficient applications as well as reduced code duplication.

(Don't know anything about DBIx::Class? Want a refresher before diving in more deeply? Maybe watch my presentation on it, or, if you don't like my face, try this one.)

The thesis of this article is that when you write code to act on things at the set level, you can often leverage the database's own optimizations and thus produce faster code at a lower level.

Set Based DBIx::Class

The most important feature of DBIx::Class is not the fact that it saves you time by allowing you to sidestep database incompatibilities. It's not that you never have to learn the exact way to paginate correctly with SQL Server. It isn't even that you won't have to write DDL for some of the most popular databases. Of course DBIx::Class does do these things. Any ORM worth its weight in salt should.

Chaining

The most important feature of DBIx::Class is the ResultSet. I'm not an expert on ORMs, but I've yet to hear of another ORM which has an immutable[†] query representation framework. The first thing you must understand to achieve DBIx::Class mastery is ResultSet chaining. This is basic but critical.

The basic pattern of chaining is that you can do the following and not hit the database:

$resultset->search({
   name => 'frew',
})->search({
   job => 'software engineer',
})

What the above implies is that you can add methods to your resultsets like the following:

sub search_by_name {
   my ($self, $name) = @_;

   $self->search({ $self->current_source_alias . ".name" => $name })
}

sub is_software_engineer {
   my $self = shift;

   $self->search({
      $self->current_source_alias . ".job" => 'software engineer',
   })
}

And then the query would become merely

$resultset->search_by_name('frew')->is_software_engineer

(microtip: use DBIx::Class::Helper::ResultSet::Me to make defining searches as above less painful.)
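If the "chain without hitting the database" idea feels abstract, here is a toy model in core Perl. It is not DBIx::Class and shares none of its code: ToyResultSet, its in-memory rows and its hit counter are all invented for this sketch. The point is only that search returns a new object carrying merged conditions, and nothing is evaluated until all is called.

```perl
use strict;
use warnings;

package ToyResultSet;

# hits is a shared counter so every chained copy reports to the same place
sub new {
    my ( $class, %args ) = @_;
    my $hits = 0;
    return bless {
        rows => $args{rows},
        cond => $args{cond} || {},
        hits => $args{hits} || \$hits,
    }, $class;
}

# search never touches the data: it returns a new object with merged conditions
sub search {
    my ( $self, $cond ) = @_;
    return ref($self)->new(
        rows => $self->{rows},
        cond => { %{ $self->{cond} }, %$cond },
        hits => $self->{hits},
    );
}

# only all() does the work, standing in for the real database query
sub all {
    my ($self) = @_;
    ${ $self->{hits} }++;
    my $cond = $self->{cond};
    return grep {
        my $row = $_;
        !grep { $row->{$_} ne $cond->{$_} } keys %$cond;
    } @{ $self->{rows} };
}

sub hits { ${ $_[0]{hits} } }

package main;

my $people = ToyResultSet->new(
    rows => [
        { name => 'frew',  job => 'software engineer' },
        { name => 'frew',  job => 'goblin king' },
        { name => 'jesse', job => 'software engineer' },
    ],
);

my $chained = $people->search( { name => 'frew' } )
                     ->search( { job => 'software engineer' } );
my $hits_before = $people->hits;    # chaining alone costs nothing
my @found       = $chained->all;    # the one and only "query"
my $hits_after  = $people->hits;

print "hits before: $hits_before, after: $hits_after, rows: ", scalar @found, "\n";
```

Since each search leaves its invocant untouched, $people can be reused as the starting point for any number of other chains.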

Relationship Traversal

The next thing you need to know is relationship traversal. This can happen two different ways, and to get the most code reuse out of DBIx::Class you'll need to be able to reach for both when the time arises.

The first is the more obvious one:

$person_rs->search({
   'job.name' => 'goblin king',
}, {
   join => 'job',
})

The above finds person rows that have the job "goblin king."

The alternative is to use "related_resultset" in DBIx::Class::ResultSet:

$job_rs->search_by_name('goblin king')
       ->related_resultset('person')

The above generates the same query, but allows you to use methods that are defined on the job resultset.

Subqueries

Subqueries are less important for code reuse and more important in avoiding incredibly inefficient database patterns. Basically, they allow the database to do more on its own. Without them, you'll end up asking the database for data, then you'll send that data right back to the database as part of your next query. It's not only pointless network overhead but also two queries.

Here's an example of what not to do in DBIx::Class:

my @failed_tests = $tests->search({
   pass => 0,
})->all;

my @not_failed_tests = $tests->search({
   id => { -not_in => [map $_->id, @failed_tests] }, # XXX: DON'T DO THIS
});

If you got enough failed tests back, this would probably just error. Just Say No to inefficient database queries:

my $failed_tests = $tests->search({
   pass => 0,
})->get_column('id')->as_query;

my @not_failed_tests = $tests->search({
   id => { -not_in => $failed_tests },
});

This is much more efficient than before: it's just a single query, it lets the database do what it does best, and it gives you exactly what you want.

Christmas!

Ok so now you know how to reuse searches as much as is currently possible. You understand the basics of subqueries in DBIx::Class and how they can save you time. My guess is that you actually already knew that. "This wasn't any kind of ninja secret, fREW! You lied to me!" I'm sorry, but now we're getting to the real meat.

Correlated Subqueries

One of the common, albeit expensive, usage patterns I've seen in DBIx::Class is using N + 1 queries to get related counts. The idea is that you do something like the following:

my @data = map +{
   %{ $_->as_hash },
   friend_count => $_->friends->count, # XXX: BAD CODE, DON'T COPY PASTE
}, $person_rs->all

Note that the $_->friends->count is a query to get the count of friends. The alternative is to use correlated subqueries. Correlated subqueries are hard to understand and even harder to explain. The gist is that, just like before, we are just using a subquery to avoid passing data to the database for no good reason. This time we are just going to do it for each row in the database. Here is how one would do the above query, except as promised, with only a single hit to the database:

my @data = map +{
   %{ $_->as_hash },
   friend_count => $_->get_column('friend_count'),
}, $person_rs->search(undef, {
   '+columns' => {
      friend_count => $friend_rs->search({
         'friend.person_id' =>
            { -ident => $person_rs->current_source_alias . ".id" },
      }, {
        alias => 'friend',
      })->count_rs->as_query,
   },
})->all

There are only two new things above. The first is -ident. All -ident does is tell DBIx::Class "this is the name of a thing in the database, quote it appropriately." In the past people would have written -ident using queries like this:

'friend.person_id' => \' = foo.id' # don't do this, it's silly

So if you see something like that in your code base, change it to -ident as above.

The next new thing is the alias => 'friend' directive. This merely ensures that the inner rs has its own alias, so that you have something to correlate against. If that doesn't make sense, just trust me and cargo cult for now.

This adds a virtual column, which is itself a subquery. The column is, basically, $friend_rs->search({ 'friend.person_id' => $_->id })->count, except it's all done in the database. The above is horrible to recreate every time, so I made a helper: DBIx::Class::Helper::ResultSet::CorrelateRelationship. With the helper the above becomes:

my @data = map +{
   %{ $_->as_hash },
   friend_count => $_->get_column('friend_count'),
}, $person_rs->search(undef, {
   '+columns' => {
      friend_count => $person_rs->correlate('friend')->count_rs->as_query
   },
})->all

::ProxyResultSetMethod

Correlated Subqueries are nice, especially given that there is a helper to make creating them easier, but it's still not as nice as we would like it. I made another helper which is the icing on the cake. It encourages more forward-thinking DBIx::Class usage with respect to resultset methods.

Let's assume you need friend count very often. You should make the following resultset method in that case:

sub with_friend_count {
   my $self = shift;

   $self->search(undef, {
      '+columns' => {
         friend_count => $self->correlate('friend')->count_rs->as_query
      }
   })
}

Now you can just do the following to get a resultset with a friend count included:

$person_rs->with_friend_count

But to access said friend count from a result you'll still have to use ->get_column('friend_count'), which is a drag since using get_column on a DBIx::Class result is nearly using a private method. That's where my helper comes in. With DBIx::Class::Helper::Row::ProxyResultSetMethod, you can use the ->with_friend_count method from your row methods, and better yet, if you used it when you originally pulled data with the resultset, the result will use the data that it already has! The gist is that you add this to your result class:

__PACKAGE__->load_components(qw( Helper::Row::ProxyResultSetMethod ));
__PACKAGE__->proxy_resultset_method('friend_count');

and that adds a friend_count method on your row objects that will correctly proxy to the resultset or use what it pulled or cache if called more than once!

::ProxyResultSetUpdate

I have one more, small gift for you. Sometimes you want to do something when either your row or resultset is updated. I posit that the best way to do this is to write the method in your resultset and then proxy to the resultset from the row. If you force your API to update through the result you are doing N updates (one per row), which is inefficient. My helper simply needs to be loaded:

__PACKAGE__->load_components(qw( Helper::Row::ProxyResultSetUpdate ));

and your results will use the update defined in your resultset.

Don't Stop!

This isn't all! DBIx::Class can be very efficient and also reduce code duplication. Whenever you have something that's slow or bound to result objects, think about what you could do to leverage your amazing storage layer's speed (the RDBMS) and whether you can push the code down a layer to be reused more.

[†] if it weren't for the fact that there is an implicit iterator akin to each %foo it would be 100% immutable. It's pretty close though!

Better Testing

2012-12-20T00:00:00Z

Moose is slow!

At least when testing. Moose's compile time speed isn't typically a problem when running things like web applications, since they only start up once, but tests frequently run many instances of the application in quick succession, and this can add quite a bit of time to the overall runtime of the test suite. This can in fact happen with a lot of different modules - Moose is just the most well known example, but any large module will have a similar effect.

If you look at what's actually happening though, all of this extra time is spent doing the same thing. The same code is loaded at the start, and then only after compilation is finished do things start to diverge (to run the actual tests themselves). There's no reason that the code that runs for use Moose should need to run multiple times during the test suite, since it always does the same thing, and so a lot of time could be saved by loading modules fewer times.

Test::Aggregate

In the past, people have attacked this problem by combining test files into fewer, bigger ones, or by using something like Test::Aggregate to automate this process. This is error-prone, because a lot of times tests can have global effects - installing subs into packages, creating classes, etc. We really do want tests to run in separate environments, to avoid allowing them to interfere with each other.

App::ForkProve

App::ForkProve solves this problem. It is a wrapper for App::Prove, which allows you to preload modules, and then instead of running each of the test files via fork and exec, it runs them via fork and eval. This way, the preloaded modules are already loaded in the current interpreter, and so when the test files are run, the use statement is just a no-op.

This actually works remarkably well - the OX test suite takes 30 seconds to run make test on my laptop, which decreases to 14 seconds under prove -rj5 -l t (since it runs the tests in parallel on multiple processors), but forkprove -rj5 -l -MOX -MOX::Request -MOX::Response -MOX::RouteBuilder::Code -MOX::RouteBuilder::ControllerAction -MOX::RouteBuilder::HTTPMethod t runs in just 3 seconds.
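The fork-and-eval idea itself fits in a few lines of core Perl. What follows is only a sketch of the technique on a Unix-like system, not how App::ForkProve is written: run_forked, the two throwaway test files and the use of List::Util as a stand-in for an expensive preloaded module are all assumptions, and no TAP is parsed.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use POSIX ();
use List::Util ();    # stands in for an expensive preloaded module

$| = 1;               # avoid duplicated buffered output across forks
our $touched = 0;     # lets a test file show it cannot affect the parent

# Run each file in a forked child via do(), not via exec of a fresh perl
sub run_forked {
    my (@files) = @_;
    my %status;
    for my $file (@files) {
        my $pid = fork();
        die "fork failed: $!" if !defined $pid;
        if ( $pid == 0 ) {

            # child: everything loaded before the fork is already compiled
            my $ok = do $file;

            # exit at once, without running END blocks inherited from the parent
            POSIX::_exit( $ok ? 0 : 1 );
        }
        waitpid $pid, 0;
        $status{$file} = $? >> 8;
    }
    return \%status;
}

# two hypothetical "test files"
my $dir    = tempdir( CLEANUP => 1 );
my %source = (
    "$dir/pass.t" => 'use List::Util; $main::touched = 1; 1;',
    "$dir/fail.t" => 'die "boom\n";',
);
for my $file ( keys %source ) {
    open my $fh, '>', $file or die "Can't write $file: $!";
    print {$fh} $source{$file};
    close $fh or die "Can't close $file: $!";
}

my $status = run_forked( sort keys %source );
print "$_ exited with $status->{$_}\n" for sort keys %$status;
print "parent still sees \$touched = $touched\n";
```

Each file runs in its own process, so the global set by pass.t never reaches the parent, yet neither child had to compile the preloaded module again.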

Tips and tricks

That command line did get a bit long though, and it's hard for people who aren't the developer to know what things are useful to preload. It may be useful to provide a module along with your test suite that does the job of loading all of the useful modules, so you only have to specify a single -M option. For instance, here is the contents of t/Preload.pm in the OX repository:

package t::Preload;
use strict;
use warnings;

use OX;
use OX::Request;
use OX::Response;
use OX::RouteBuilder::Code;
use OX::RouteBuilder::ControllerAction;
use OX::RouteBuilder::HTTPMethod;

1;

Now, you can just run forkprove -rj5 -l -Mt::Preload t to get the same effect.

Another useful trick is that since forkprove is entirely compatible with prove except for the -M option, you can replace prove with forkprove entirely, by adding an alias to your shell configuration:

  alias prove="forkprove"

This way, prove will continue to work as it always has in the past, but if you specify any -M options, they will be preloaded.

Caveats

This isn't entirely free, however. One obvious place where this would cause problems is in test files which test to make sure certain modules don't get loaded in certain situations. If you preload those modules, those tests will start failing.

In addition, since the tests are running from forkprove itself, any calls to Carp::confess or similar will report a longer stacktrace than they would otherwise, because all of the App::ForkProve machinery is actually still on the call stack. This is not typically a problem, but can potentially cause failures if you are relying on matching the entire stacktrace in a test.

TAP is ugly!

So now we have our tests running nice and quickly, and we make a change in our actual code, and it causes some tests to fail. The trouble is, the actual causes of the failures can be obscured by all of the prove output, especially if it's running in parallel. It'd be nice to have an easily skimmable output that makes it much more apparent what is wrong.

A typical solution here is to run prove -l t, see the list of failures at the end, and run the test files individually with perl -Ilib t/failing-test.t. This isn't great though, since raw TAP isn't the easiest thing to read. Additionally, if your tests don't have descriptions, it can be quite hard to find the test you're looking for.

Test::Pretty

Test::Pretty modifies the TAP output in order to make it a lot more pleasant to read. It adds colored output and automatically generates a test description, based on the line number and contents of the test, for any test that doesn't have one.

In addition, it cleans up the output of subtests to make them easier to follow.

Tips and tricks

Another shell helper, this time a function, can make using this easier:

function t {
    if [[ -d blib ]]; then
        perl -Mblib -MTest::Pretty "$@"
    else
        perl -Ilib -MTest::Pretty "$@"
    fi
}

This way, t t/foo.t will run the given test file, using blib if appropriate.

A Cache Present

2012-12-19T00:00:00Z

People love receiving cash for Christmas, but a cache is a much more useful gift for your performance-hungry web server or application.

Today we'll talk about CHI, a modern Cache Handling Interface for Perl -- sort of a DBI for caching.

USING CHI

Creating a cache looks like:

my $cache = CHI->new(
    driver    => '...',
    namespace => '...',
    # driver specific args
);

driver indicates the cache backend, which controls how the cache data will be stored. Available backends include Memory, File, BDB, Memcached, and Redis - see CPAN for a complete list - and creating your own driver is simple.

namespace is a string that keeps this cache separate from other caches on the same backend. Often it's the name of the caller's Perl package or script.

CHI honors the standard get/set API that most cache modules use:

# Try to get value from cache.
#
my $data = $cache->get($key);
if ( !defined $data ) {

    # Was not in cache. Compute $data here.
    #
    $data = ...;

    # Store in cache with a 10 minute expiration time.
    #
    $cache->set( $key, $data, "10m" );
}

It also provides an all-in-one compute API, which is shorter and less error-prone:

# Try to get value from cache; if missing, call the sub
# and store the returned value.
#
my $data = $cache->compute($key, "10m", sub {
    # Compute and return value here
});

FEATURES

With CHI you get a lot of caching features under the tree, and you can use them no matter which backend you've chosen.

Automatic key/value serialization

You can store arbitrary values in the cache, including listrefs, hashrefs and combinations thereof; CHI will automatically serialize and deserialize them for you. Automatic compression over a certain size is also an option.

You can also use arbitrary references as cache keys, e.g.

my $key = [$pub_id, $article_id, $page_id];
my $data = $cache->get($key, ...);

This saves you from the tedious and failure-prone process of composing multiple values into a key. And if your key is too long or too weird for your driver, CHI will digest and/or escape it for you.
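To give a feel for what "serialize, then digest if needed" means, here is a small core-Perl sketch. It is an illustration of the idea and not CHI's code: key_to_string is a made-up helper, the choice of canonical JSON plus MD5 is an assumption of this sketch, and the 248-character limit is arbitrary.

```perl
use strict;
use warnings;
use JSON::PP ();
use Digest::MD5 qw( md5_hex );

# canonical JSON gives the same string for the same data every time
my $serializer = JSON::PP->new->canonical;

# Turn any key into a plain string; digest it if it is too long
sub key_to_string {
    my ( $key, $max_length ) = @_;
    my $string = ref $key ? $serializer->encode($key) : $key;
    return length($string) > $max_length ? md5_hex($string) : $string;
}

my $short = key_to_string( [ 42, 'santa', 3 ], 248 );
my $long  = key_to_string( [ ('x') x 200 ], 248 );

print "$short\n";    # [42,"santa",3]
print "$long\n";     # a 32-character hex digest
```

The useful property is that equal data structures always map to the same cache key, so callers never have to invent their own joining scheme.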

Multilevel caches

You can chain multiple caches together in various ways. For example, here we place a size-limited memory L1 cache in front of a memcached cache. CHI will look in the memory cache first; on a miss, it will consult memcached and write back the value into the memory cache.

my $cache = CHI->new(
    driver   => 'Memcached',
    servers  => [ "10.0.0.15:11211", "10.0.0.15:11212" ],
    l1_cache => { driver => 'Memory', global => 1,
                  max_size => 1024*1024 }
);

Miss stampede avoidance

A miss stampede occurs when a popular cache item expires, and a large number of processes all rush to recompute it. CHI provides two ways to reduce or avoid this common cache problem - probabilistic expiration (in which expiration occurs over a range, instead of a single fixed time) and busy locks (in which the first process sets a lock so that other processes know not to start recomputing).
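Probabilistic expiration is easier to picture with numbers. The sketch below is one plausible way to implement the idea in core Perl, not a copy of CHI's internals: is_expired and its parameters are hypothetical, and the random roll is passed in so the behaviour can be checked deterministically.

```perl
use strict;
use warnings;

# Decide whether a cached item should be treated as expired at time $now.
# $variance runs from 0 (fixed expiry) to 1 (may expire any time after creation).
# $roll is a random number in [0, 1), passed in so the logic is testable.
sub is_expired {
    my ( $now, $created_at, $expires_at, $variance, $roll ) = @_;
    return 1 if $now >= $expires_at;

    # the earliest moment at which the item may be treated as expired
    my $early_at = $expires_at - $variance * ( $expires_at - $created_at );
    return 0 if $now < $early_at;

    # the chance of expiring grows linearly from 0 at $early_at to 1 at $expires_at
    my $chance = ( $now - $early_at ) / ( $expires_at - $early_at );
    return $roll < $chance ? 1 : 0;
}

# An item created at t=0 that expires at t=600, with a variance of 0.25,
# may start expiring at t=450.
print is_expired( 400, 0, 600, 0.25, 0.4 ), "\n";    # 0: too early to expire
print is_expired( 525, 0, 600, 0.25, 0.4 ), "\n";    # 1: halfway, and the roll is below 0.5
print is_expired( 525, 0, 600, 0.25, 0.6 ), "\n";    # 0: halfway, but the roll is above 0.5
```

In real use the last argument would be rand, so that different processes decide to recompute at different moments instead of all at once.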

Logging and statistics

You can tell CHI to log every cache hit, miss and set for debugging purposes. You can also tell CHI to output statistics about the performance of your caches, including the hit/miss rate and the average compute time for each namespace.

Happy caching all!

Synchronous Operations are So Outdated

2012-12-18T00:00:00Z

Understanding asynchronous events

The best way to explain why synchronous code can sometimes be daunting is to use an example from Real Life™. A single day in our lives can contain plenty of actions that make us cringe and growl. Take, for instance, trying to make a meal.

Imagine you're cooking. You wouldn't wait for the water to boil before you prepared the potatoes. Nor would you wait for the potatoes to be done before you started working on the salad.

Asynchronous programming means having multiple events happen at the same time. It allows you to get more things done while you're waiting for other things to happen.

The fundamental element of asynchronous programming is the callback, so let's review that first, and then take a look at some examples of async in code-land.

We will be using AnyEvent for this article, but the same principles exist in all other async frameworks.

Introduction to callbacks

Since multiple events run at the same time, the application (much like the spice) must flow. To make this work, whenever we start one event we include references to the code that should be run when it finishes or hits other milestones. Since the event then "knows" how to proceed on its own, it can start up and work in the background while the rest of the program continues on doing more things.

We're going to be using a technique that some are not familiar with: callbacks. Just to get you up to speed, let me start by explaining callbacks in a nutshell: callbacks are just references to subroutines.

These subroutines can be defined using names or they can be anonymous. We can call those subroutines by their reference instead of their name.

# callbacks to named subroutines
sub func { ... }
my $func_reference = \&func;
$func_reference->(@arguments);

# callbacks to anonymous subroutines
my $func_reference = sub { ... };
$func_reference->(@arguments);

If we use sub to create a reference to a subroutine, we can pass the callback as a parameter directly, without saving it first:

sub some_cb_handler {
    my $callback = shift;
    $callback->("hello");
}

# We pass in the callback without ever giving it a name or making it
# globally accessible.
some_cb_handler( sub {
    my $greeting = shift;
    say "$greeting, world!";
} );

Reading from input

You have an application that needs to read from a handle (which could be a file descriptor, a socket, or even the standard input), but you don't know when it will be ready to be read.

In a synchronous application, you'll be waiting for it to become available, possibly calling sleep in between. But these days, we're busy people, we can't just be waiting by the phone. We have stuff to do!

sub alert_action {
    my $action = shift;
    say "New action found: $action";
}

my $io_watcher = AnyEvent->io(
    fh   => $fh,
    poll => 'r',
    cb   => sub {
        # we can now read!
        my $input = <$fh>;
        if ( $input =~ /^New action: (\w+)/ ) {
            alert_action($1);
        }
    },
);

# continue to do something else

How does that work? By calling AnyEvent's io method, you're creating a new watcher that checks a file handle for new read events. If it has something to read, it will call the code reference we provided. Both the checks and the subroutine call will happen in the background.

Also, since we've given it all the information it needs (which file handle to poll, what kind of events we want, and what to do in that case), it doesn't need to hold us back. That way we can continue with some other code, and the watcher will wait and run in the background, without bothering us.
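To demystify what "checking a handle in the background" boils down to, here is a hand-rolled version of one turn of such a loop using the core IO::Select module. It is a simplification and not how AnyEvent is implemented: poll_once is a made-up name, and a pipe stands in for whatever handle you would really be watching.

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# A pipe stands in for the socket or file handle being watched
pipe( my $reader, my $writer ) or die "pipe failed: $!";
$writer->autoflush(1);

my $select = IO::Select->new($reader);
my @actions;

# One turn of a very small event loop: poll, and run the callback if readable
sub poll_once {
    my ($timeout) = @_;
    for my $fh ( $select->can_read($timeout) ) {
        my $input = <$fh>;
        push @actions, $1
            if defined $input && $input =~ /^New action: (\w+)/;
    }
    return scalar @actions;
}

my $before = poll_once(0);    # nothing written yet, so nothing to read
print {$writer} "New action: wrap\n";
my $after = poll_once(1);     # the handle is now readable

print "actions before: $before, after: $after (@actions)\n";
```

An event framework repeats this kind of poll for every registered watcher, which is why your own code is free to carry on in between.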

Keeping the watchers alive

There is a problem I haven't mentioned. That code is fine, except that once it executes the additional code, the application will close, simply because it reached the end of the file. We want to keep the application running, so our watchers will continue to work. How do we do that? Condition variables!

Condition variables are variables that represent a condition waiting to come true, like your cat waiting for you to get comfortable with a laptop. When the variable becomes true, the cat comes over and lies on your lap, disrupting your work.

my $done    = AnyEvent->condvar;
my $watcher = AnyEvent->io(
    fh   => $fh,
    poll => 'r',
    cb   => sub {
        my $input = <$fh>;
        ...
        if ( $input =~ /^End of processing file/ ) {
            $done->send;
        }
    },
);

...
$done->recv;
say "All done!";

This time we created a condition variable that is available to the watcher. The watcher still dilligently continues its work. Only this time as soon as it finds a line in the file indicating the end of it, it will call send on the condition variable, making the condition true, effectively saying "that's it, we're done".

When recv is called on the condition variable, the program waits there until something in the background (like our watcher) calls send, and then it continues running.

That means that the line "All done!" will only be printed once our watcher has read the line marking the end of the file.

Another ramification of the condition variable's behavior is that it is possible to create an infinite loop by creating a condition variable, calling recv, and not having anything call send on it. It looks exactly like this:

my $cv = AnyEvent->condvar;
$cv->recv;

# or in short
AnyEvent->condvar->recv;

Since the application is now waiting for a condition variable to come true, it will not terminate. Because nothing can call send on this variable, it basically means the application will stay up indefinitely. The most common use for this is daemons, which should always be running.

Timing your cooking

The last element in AnyEvent that we'll be looking at is the timer. Timers are events (any kind of event) that get run at some point in time. A timer can fire a few minutes from now or at a specific hour. It can happen once, repeat several times, or even repeat forever.

my $timer = AnyEvent->timer(
    after    => 3.5,
    interval => 5,
    cb       => sub {
        say "Ping? Pong!";
    },
);

This defines a timer that will wait 3.5 seconds, and then call the subroutine every 5 seconds. Fairly simple. Let's try a few timers.

my @steps        = qw<Cutting Simmering Cooking Seasoning Serving>;
my $current_step = 'Preparing';

my $done = AnyEvent->condvar;
my $t1   = AnyEvent->timer(
    interval => 60 * 7,
    cb       => sub {
        say "Current cooking state: $current_step";
    },
);

my $t2 = AnyEvent->timer(
    after    => 2,       # two seconds to wash hands before working!
    interval => 60 * 10, # assuming every action takes 10 minutes
    cb       => sub {
        $current_step = shift @steps or return $done->send;
        do_step($current_step);
    },
);

$done->recv;
say "Dinner is served!";

What we have here isn't the best example for how to make a meal, but it does give us an example showing multiple timers. The first timer ($t1) keeps alerting us every seven minutes about our progress. Meanwhile, our second timer picks up an action to do every 10 minutes, and does it. Once no more actions are available, it tells the condition variable that it's done. It does this by simply returning out of the subroutine (so we don't call do_step again) and calling send at the same time.

After we created our timers, we set up a recv on a condition variable, meaning "don't continue running the rest of the application until we are notified that the timers finished their work". It will wait at that point (without blocking the timers) until send is called. Then it will continue and say dinner is finally served. Since it's the end of the application, the timers will close and the application will end.

Here is the output we'll get from running the application:

  Current cooking state: Preparing
  (do_step() called with "Cutting")
  Current cooking state: Cutting
  (do_step() called with "Simmering")
  Current cooking state: Simmering
  (do_step() called with "Cooking")
  Current cooking state: Cooking
  Current cooking state: Cooking
  (do_step() called with "Seasoning")
  Current cooking state: Seasoning
  (do_step() called with "Serving")
  Current cooking state: Serving
  Current cooking state: Serving
  Dinner is served!
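Nobody wants to wait fifty minutes to watch timers fire, so here is a scaled-down sketch of the same after/interval idea. The tick counter and the tenth-of-a-second delays are my own choices for illustration, and it assumes AnyEvent is installed from CPAN.

```perl
use strict;
use warnings;
use AnyEvent;

my $done  = AnyEvent->condvar;
my $ticks = 0;

my $timer;
$timer = AnyEvent->timer(
    after    => 0.2,    # first call after 0.2 seconds
    interval => 0.1,    # then again every 0.1 seconds
    cb       => sub {
        $ticks++;
        print "Tick $ticks\n";

        if ( $ticks == 3 ) {
            undef $timer;    # destroying the watcher cancels the timer
            $done->send;
        }
    },
);

$done->recv;
print "Stopped after $ticks ticks\n";
```

Note that the watcher object has to stay alive for the timer to keep firing, which is why it is stored in $timer and explicitly undefined when we've had enough.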

Condition variables with multiple calls

Sometimes the behavior of the condition variable's send and recv is not flexible enough to handle instances in which you need to be able to wait on multiple calls.

Suppose you have a calculation to do that depends on the result of multiple database queries. Before the SQL experts jump at it, let's also suppose these queries are made across different databases.

A database connection is in fact a network operation, which means it blocks. This is an ideal example for async programming. You could initiate several connections and queries concurrently instead of consecutively. Using condition variables, you would probably try to create three condition variables and then wait for each to come true. That won't work, since you can only call recv on one variable at a time.

Instead, condition variables can accept a begin and end call to signify a multi-call request. Once there's been an end call for each begin call, it will return to the recv method.

my $cv  = AnyEvent->condvar;
my $sum = 0;
foreach my $db (@dbs) {
    # beginning an event
    $cv->begin;

    $db->query( $query, sub {
        my $amount = shift;
        $sum += $amount;

        # finishing an event
        $cv->end;
    } );
}

$cv->recv;
say "All database queries finished.";
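The snippet above leans on a database object with an asynchronous query method, which you may not have lying around. As a rough, runnable sketch of the same begin/end bookkeeping, here timers stand in for the slow queries. The fake_query helper and its delays and amounts are invented for this illustration, and AnyEvent needs to be installed.

```perl
use strict;
use warnings;
use AnyEvent;

# Hypothetical stand-in for an asynchronous query: it "answers" after a
# short delay by passing an amount to the callback.
my @pending;    # keeps the timer watchers alive until they fire

sub fake_query {
    my ( $delay, $amount, $cb ) = @_;
    push @pending, AnyEvent->timer(
        after => $delay,
        cb    => sub { $cb->($amount) },
    );
}

my $cv  = AnyEvent->condvar;
my $sum = 0;

# three pretend databases, each with its own latency and result
foreach my $db ( [ 0.3, 10 ], [ 0.1, 20 ], [ 0.2, 30 ] ) {
    $cv->begin;
    fake_query( @$db, sub {
        $sum += shift;
        $cv->end;
    } );
}

$cv->recv;    # returns once every begin has been matched by an end
print "Sum of all results: $sum\n";
```

The answers arrive out of order, but that doesn't matter: recv only returns after the last end call, so the sum is complete by the time it is printed.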

Bringing it all together

Now that we've gone over a few elements of AnyEvent, we can build a small, useful application. We'll add a few more modules: AnyEvent::HTTP, Regexp::Common, and File::Basename.

Suppose we have a file that contains a lot of links and we want to download every image listed in it. These are two different actions: (1) reading the file and (2) downloading the images. We will also have a timer that gives us the progress every two seconds.

use v5.10;
use AnyEvent;
use AnyEvent::HTTP;
use Regexp::Common 'URI';
use File::Basename 'basename';
use autodie;

my $counter = 0;
my $cv      = AnyEvent->condvar;
open my $fh, '<', 'links.txt';

# hold the condition variable open until the whole file has been read
$cv->begin;

my $fhwatcher;
$fhwatcher = AnyEvent->io(
    fh   => $fh,
    poll => 'r',
    cb   => sub {
        my $line = <$fh>;

        # end of file: stop watching and release our own begin
        if ( !defined $line ) {
            undef $fhwatcher;
            return $cv->end;
        }
        chomp $line;

        # ignoring lines that aren't HTTP URIs
        $line =~ /^$RE{URI}{HTTP}$/ or return;

        # call an HTTP request
        $cv->begin;
        http_get $line, sub {
            my $body     = shift;
            my $filename = basename($line);

            no autodie;    # warn instead of dying inside the callback
            if ( open my $out, '>:raw', $filename ) {
                print {$out} $body // '';
                close $out;
                $counter++;
            }
            else {
                warn "Couldn't write to $filename: $!";
            }

            $cv->end;
        };
        return;
    },
);

my $progress = AnyEvent->timer(
    after    => 2, # giving it two seconds before starting
    interval => 2, # report every two seconds
    cb       => sub {
        printf "[%s] Update: finished downloading $counter images.\n",
            scalar AnyEvent->now;
    },
);

$cv->recv;
close $fh;
say "Finished downloading all files";

Let's analyze what we've got here. We use some modules that you should recognize. If you don't, you should check them out.

The next thing is opening a file handle. We then set up a watcher for some I/O operations using AnyEvent's io method. It needs the file handle we are going to operate on, and the kind of operation we'll do (we pick r for reading) and a callback to run. This callback is the main thing that takes a bit to understand.

Every time we read a line that has a URL, we call begin on the condition variable. We issue an HTTP request for that URL and once we finish fetching it and saving it, we issue the corresponding end call. When all begin calls have ended, it will return to the recv method, much like calling send.

We also created a progress timer that announces, every two seconds, the number of images we've downloaded. You'll notice it uses AnyEvent's now, which is the recommended way to get the current time when running in an event loop.

The recv call at the end will wait until every begin call has been matched by an end call. Once we've worked through the entire file, it will print a nice message and the application will end.

Just the beginning...

Once you get used to programming asynchronously, it's like having scissors: you just run with it! Note: Do not run with scissors. ✂ 🏃

Try out an event framework and see how much fun it is for yourself. Perl has many to offer, such as AnyEvent, POE, IO::Async, Reflex, IO::Lambda, Coro, and more.

Santa Has Dependencies Too

2012-12-17T00:00:00Z

In the old days, Santa's elves would build every toy from scratch, but now he outsources most of the parts for the toys. Naturally, he has created a sophisticated supply-chain management system to ensure that each toy is consistently built from the same parts.

The same is true for software development. These days, our applications depend on lots of frameworks and libraries. So we also need to manage the supply of those dependencies to ensure that every build has the same "parts."

Pinto helps you manage your supply of dependencies by creating a custom repository of Perl modules. The repository is fully compatible with CPAN installers (e.g. cpan, cpanm, cpanp), but unlike the public CPAN, the modules in your Pinto repository only change when you want to change them. You'll get the exact same result each and every time you build.

The pinto command line utility does all the work of creating the repository, and provides some helpful tools for managing change as your dependencies evolve over time. Let's take a look at some of the things you can do...

First, let's create a repository. All you need is a directory where the repository will live (we'll use ~/my_repo here) and the name of the stack (we'll use prod here). A stack is just a named subset of modules in your repository (more on that later). Here's what the command would look like:

  $ pinto -r ~/my_repo init --stack=prod

Suppose we want to use Catalyst for a new application. Let's get it from the CPAN and put it in our Pinto repository. This command will put the latest (at this moment) version of Catalyst and all of its dependencies into our Pinto repository:

  $ pinto -r ~/my_repo pull Catalyst

To install Catalyst, we just point cpanm (or cpan or cpanp) at the stack inside the repository. Every time we do this, we'll get exactly the same version of Catalyst and its dependencies, even if newer versions have been released to the public CPAN:

  $ cpanm --mirror=file:///home/jeff/my_repo/prod --mirror-only Catalyst

From time to time, Santa decides to upgrade the parts used to build a toy, or even switch to a new parts supplier entirely. To ensure quality, Santa always sets up a separate assembly line for the elves to test the new parts before committing them to mass production.

With Pinto, you can do the same thing. Suppose that Catalyst 4.0 is released to the CPAN and we want to try upgrading our application, which now has several other dependencies of its own. We can make an experimental duplicate of those dependencies by copying the stack like this:

  $ pinto -r ~/my_repo copy prod catalyst-upgrade

Any changes we make to the "catalyst-upgrade" stack are completely separate from the "prod" stack. So we can now go ahead and upgrade Catalyst (and whatever new modules it may require) like this:

  $ pinto -r ~/my_repo pull --stack=catalyst-upgrade Catalyst~4.0

To test our upgraded application dependencies, we just make a new build by pointing cpanm at the "catalyst-upgrade" stack inside the repository:

  $ cpanm --mirror=file:///home/jeff/my_repo/catalyst-upgrade --mirror-only Catalyst

If our application (and all of its dependencies) builds cleanly, then we can just merge the two stacks together and throw away the experimental stack:

  $ pinto -r ~/my_repo merge catalyst-upgrade prod
  $ pinto -r ~/my_repo delete catalyst-upgrade

Occasionally, Santa's elves find that a new version of a part is flawed or just not compatible with their current line of toys. Since the workshop is pretty big, it can be hard to ensure that every elf foreman doesn't mistakenly order the new (flawed) part for his assembly line. So Santa keeps a real-time blacklist of all the part numbers that are not allowed in the workshop.

This happens all the time in software development, so Pinto allows you to "pin" the modules in your repository, which prevents them from being upgraded. Suppose we already have Plack 2.0 in our Pinto repository and we learn that Plack 3.0 is not compatible with our application. So we can pin Plack to let everyone know that it can't be upgraded yet:

  $ pinto -r ~/my_repo pin Plack

If anyone tries to upgrade Plack directly or to satisfy the prerequisites for some other module, then Pinto will refuse to comply. Once you've resolved the problem, then you can unpin Plack and upgrade it as needed.

Keeping lists of all the naughty and nice children is a huge task, so Santa has become very good at record keeping. He also keeps excellent records of everything that happens in the workshop. This helps him to identify the critical links in his supply chain or reward deserving elves.

Pinto keeps records too, so you can see what's in the repository right now and how it has changed over time. Here are some of the things you can do:

  # Show all the modules in the stack right now:
  $ pinto -r ~/my_repo list

  # Show who's responsible for the current modules in the stack:
  $ pinto -r ~/my_repo blame

  # Show how and why the stack has changed over time:
  $ pinto -r ~/my_repo log --detailed

As you can imagine, Santa Claus has pretty much perfected the science of supply-chain management, so when it comes to managing our supply of module dependencies, we software developers could probably learn a lot from him. Perhaps Pinto should have been called "Donner" or "Vixen."

Creating Your Own Perl

2012-12-16T00:00:00Z

use List::Util qw( reduce );

my @numbers  = 1 .. 10;
my $even_sum = reduce { $a + $b } grep { $_ % 2 == 0 } @numbers;

See what I did there? Unlike some functional programming languages, Perl doesn't have a built-in fold or reduce keyword, so I cleverly imported the reduce function from List::Util. (Of course, if I'd been really clever, I'd have noticed List::Util also has a sum function available.)

Due to some trickery with sub prototypes and manipulating its caller's symbol table, List::Util manages to make its reduce function feel just like a built-in language feature. It uses the same codeblock syntax as grep and map, and the same magic $a and $b variables as sort.
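To make that trickery a little less mysterious, here is a simplified sketch of the technique using only core Perl. The name my_reduce is made up for this illustration; it follows the general shape of a pure-Perl reduce but is not List::Util's actual code.

```perl
use strict;
use warnings;

# The (&@) prototype lets callers pass a bare block as the first argument,
# just like grep and map.
sub my_reduce(&@) {
    my ( $code, @list ) = @_;
    return unless @list;

    # Alias the *caller's* package variables $a and $b (the same ones sort
    # uses) to two lexicals for the duration of this call.
    my $pkg = caller;
    no strict 'refs';
    local *{"${pkg}::a"} = \my $x;
    local *{"${pkg}::b"} = \my $y;

    $x = shift @list;
    for my $item (@list) {
        $y = $item;
        $x = $code->();    # the block sees $x and $y as $a and $b
    }
    return $x;
}

my $total = my_reduce { $a + $b } 1 .. 10;
print "$total\n";
```

The prototype supplies the block syntax, and the glob assignment supplies the magic variables. Together they are enough to make an ordinary sub look like part of the language.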

Via tricks like these, plus ties, overloads, custom import functions, source filters, Devel::Declare, %^H, and (in newer versions of Perl) the pluggable keyword API, Perl modules have the power to affect their caller in ways far beyond the mechanisms that other programming languages make available. When you use a module that does this, you're not just loading a library and using it at arm's length; you're changing the very syntax of Perl - lexically, within your module.

When starting a new script, or a new module, this is what we do. We add a bunch of use statements to the top of the file to tweak Perl's flavour to our liking. We make Perl a more suitable language for getting the job done; we turn a general purpose programming language into a domain-specific language suitable for our exact task. This will often begin with something like:

use v5.14;
use strict;
use warnings;

but if you're writing anything non-trivial, it's likely that a bunch of other use statements will join them.

(Of course, some modules are plain old object-oriented code that make no attempt to alter their caller's syntax. Different approaches are appropriate for different tasks.)

Twelve Lords A Leaping

Here are some of my favourite syntax-bending modules:

List::Util / List::MoreUtils

List::Util is a core Perl module with a small collection of array munging functions; List::MoreUtils is a collection of extras that didn't quite make the shortlist.

Many of these make creative use of sub prototypes to look and act like Perl's built-in list manipulation functions. The first, uniq and reduce functions are especially useful, and should be in every Perl programmer's toolkit.

PerlX::Maybe

PerlX::Maybe provides a tiny function making it easier to work with optional named parameters, a la:

my $santa = Person->new(
        title     => "Saint",
        name      => "Nicholas",
  maybe telephone => $phone,      # $phone might be undef
  maybe email     => $email,      # $email might be undef
);

Syntax::Keyword::Junction

Syntax::Keyword::Junction implements support for something approaching the Perl 6 concept of junctions; that is, variables which have multiple values at once.

my $reindeer = any(qw/
  Dasher Dancer Prancer Vixen Comet Cupid Donder Blitzen
/);
$reindeer eq "Dasher";    # true
$reindeer eq "Prancer";   # true
$reindeer eq "Rudolf";    # false

It achieves this with nothing more than careful use of overloading.
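As a rough idea of how overloading can pull this off, here is a minimal sketch in core Perl. The My::Any class and this any function are hypothetical and far less capable than the real module; they only handle eq with the junction on the left-hand side.

```perl
use v5.14;
use strict;
use warnings;

# My::Any is a hypothetical, minimal junction class for this sketch.
package My::Any {
    use overload 'eq' => sub {
        my ( $self, $other ) = @_;

        # true if at least one of the stored values matches
        return scalar grep { $_ eq $other } @$self;
    };

    sub new {
        my ( $class, @values ) = @_;
        return bless [@values], $class;
    }
}

sub any { return My::Any->new(@_) }

my $reindeer = any(qw/Dasher Dancer Prancer Vixen/);

say "Dasher is on the team"     if $reindeer eq 'Dasher';
say "Rudolf is not on the team" unless $reindeer eq 'Rudolf';
```

The object is just a blessed array of candidates, and the overloaded eq asks whether any of them match.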

aliased

aliased provides short aliases for long class names.

use aliased "Rangifer::Tarandus" => "Reindeer";

my $rudolf = Reindeer->new;

The short alias is just a constant that returns the class name as a string. Simple idea, but useful.
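If you're wondering how a constant can stand in for a class, here's a hand-rolled sketch of the idea in core Perl. The stub Rangifer::Tarandus class exists only so the example runs, and writing the constant by hand is my simplification of what the module arranges for you.

```perl
use v5.14;
use strict;
use warnings;

# A stub class so there is something to construct.
package Rangifer::Tarandus {
    sub new { return bless {}, shift }
}

# Roughly what the alias amounts to: a constant sub returning the class name.
sub Reindeer () { 'Rangifer::Tarandus' }

# Perl sees a sub called Reindeer, calls it, and invokes new on the result.
my $rudolf = Reindeer->new;
say ref $rudolf;
```

Since the alias is an ordinary sub in your package, it is lexically tidy too: other packages don't see it unless they set up their own.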

Safe::Isa

Ever get the Can't call method "isa" on an undefined value blues? Safe::Isa gives you a way to call methods like isa and can on scalars without checking that they are defined and blessed.

# Might return undef if there are no cheap desserts
my $pudding = $menu->find_food(max_price => 5, category => DESSERT);

if ($pudding->$_isa('Plum::Pudding')) {
  say "Yum!";
}

It takes advantage of the fact that coderefs may be called as methods even on unblessed or undefined invocants.
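Here's a small core-Perl sketch of that trick. The $_isa defined below is my own simplified stand-in rather than the module's code, and the Plum::Pudding stub is only there to give us something blessed to test.

```perl
use v5.14;
use strict;
use warnings;
use Scalar::Util 'blessed';

# A coderef in a scalar: calling $thing->$_isa(...) passes $thing as the
# first argument without looking anything up on $thing itself.
my $_isa = sub {
    my ( $thing, $class ) = @_;
    return blessed($thing) && $thing->isa($class);
};

package Plum::Pudding {
    sub new { return bless {}, shift }
}

my $pudding = Plum::Pudding->new;
my $nothing = undef;

my $pudding_ok = $pudding->$_isa('Plum::Pudding');
my $nothing_ok = $nothing->$_isa('Plum::Pudding');    # no exception here

say 'pudding: ', ( $pudding_ok ? 'yes' : 'no' );
say 'undef:   ', ( $nothing_ok ? 'yes' : 'no' );
```

Because no method resolution happens, the invocant can be anything at all, and the check for blessedness moves inside the coderef where it belongs.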

Try::Tiny

While eval and $@ can be used as a try-catch mechanism in Perl, there are numerous gotchas. Try::Tiny works around them for you, giving you a nice syntax for exception catching.

try {
  $gift->give($recipient);
}
catch {
  when (/^Can't call method "give"/) { } # ignore
  default { die $_ }
};

There are even nicer modules like TryCatch available, but Try::Tiny's zero-dependency approach - it uses nothing more than prototypes and guards (dummy objects with just a destructor) - is perfect for even small projects.

NEXT

NEXT adds a SUPER-like pseudo-class to your module, but with more control of method redispatch than SUPER gives you. Good if you're programming with multiple inheritance.

These days you should probably use mro instead, but NEXT deserves a mention for its clever use of AUTOLOAD and capitalised package names to create the illusion of new syntax.

Web::Simple

This Plack-based web app framework uses a sub prototype hack for dispatching.

sub (POST + /naughty_list/person+ ?name=&*) {
  my ($self, $name, $misc_params) = @_;
  ...;
}

autovivification

Perl's autovivification feature can sometimes be counterintuitive.

my $menu = undef;
exists $menu->{plum}{pudding};   # false
exists $menu->{plum};            # true !!!

The autovivification module can selectively disable autovivification for particular scopes, or get Perl to issue a warning or fatal error when autovivification occurs. Very handy.

Lots of deep XS magic in this module.
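Without the module, the usual workaround is to test one level at a time. This core-only sketch contrasts the two approaches; the variable names are mine.

```perl
use strict;
use warnings;

my $menu = {};

# Looking two levels deep quietly creates the first level...
my $found    = exists $menu->{plum}{pudding};
my $vivified = exists $menu->{plum};

# ...whereas checking each level in turn leaves the hash untouched.
my $carte = {};
my $safe  = exists $carte->{fig} && exists $carte->{fig}{pudding};

print "plum pudding found:  ", ( $found    ? "yes" : "no" ), "\n";
print "plum key now exists: ", ( $vivified ? "yes" : "no" ), "\n";
print "fig key now exists:  ", ( exists $carte->{fig} ? "yes" : "no" ), "\n";
```

The step-by-step check works, but it gets tedious for deep structures, which is where the pragma earns its keep.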

PerlX::QuoteOperator

Perl has various built-in quote-like operators. qw() constructs lists of words; qr() quotes regular expressions; and qx() acts like backticks. PerlX::QuoteOperator allows you to define your own.

use PerlX::QuoteOperator qdeer => {
  -with => sub ($) { Reindeer->new(name => $_[0]) },
};

my $rudolf = qdeer(Rudolf);

PerlX::QuoteOperator uses Devel::Declare to rewrite qdeer(...) to qdeer qq(...) while Perl is compiling your code.

Function::Parameters

Function::Parameters provides parameter lists for Perl subs. Instead of:

sub give {
  my ($gift, $recipient) = @_;
  ...;
}

You can write:

fun give ($gift, $recipient) {
  ...;
}

It supports named and positional parameters, optional parameters and methods with invocants. It provides an introspection API, and if you're using Moose, then it can hook into the Moose type constraint system to validate parameter types. Such fun!!

Function::Parameters uses Perl's new(ish) pluggable keyword API, so is only available for Perl 5.14 and above.

MooseX::Declare

Where to start? MooseX::Declare gives you class and role keywords for declaring Moose classes and Moose roles; extends, with and is for inheritance, role composition and meta traits; method for declaring methods with signatures; before, after, around, override and augment for method modifiers; and clean for scrubbing away helper functions so that outside code can't call them.

role Flight
{
  method fly (DateTime $when, Location $where) {
    ...;
  }
}

class MagicReindeer extends Reindeer with Flight
{
  before fly (DateTime $when, Location $where) {
    TimingException->throw("not Christmas Eve")
      unless $when->month == 12 && $when->day == 24;
  }
}

It uses Devel::Declare. Extensively. And a partridge in a pear tree.

Bundle Up!

If you're working on a large project with many modules, you may find that you are repeating the same set of imports at the top of almost every file. Perhaps something like:

use v5.14;
use strict;
use warnings;
use Try::Tiny;
use Scalar::Util qw( blessed );
use List::Util qw( first reduce );
use List::MoreUtils qw( uniq );
use Path::Class qw( file dir );

OK, so you can copy and paste, but copy-paste code is the enemy. Don't repeat yourself. Wouldn't it be nice to bundle up all the above functionality into a single module?

use My::Syntax;

Well, here's an example of how you could write that module:

package My::Syntax;

use v5.14;
use strict;
use warnings;
use Try::Tiny qw();
use Scalar::Util qw();
use List::Util qw();
use List::MoreUtils qw();
use Path::Class qw();
use import::into;

sub import {
  my $caller = caller;
  feature->import::into($caller, ':5.14');
  strict->import::into($caller);
  warnings->import::into($caller);
  Try::Tiny->import::into($caller);
  Scalar::Util->import::into($caller, 'blessed');
  List::Util->import::into($caller, 'first', 'reduce');
  List::MoreUtils->import::into($caller, 'uniq');
  Path::Class->import::into($caller, 'file', 'dir');
}

1;

Alternatively, Syntax::Collector makes it a little neater:

package My::Syntax;

use v5.14;
use Syntax::Collector -collect => q/
use strict           1.00         ;
use warnings         1.00         ;
use feature          1.00         qw( :5.14 );
use Try::Tiny        0.11         ;
use Scalar::Util     1.23         qw( blessed );
use List::Util       1.23         qw( first reduce );
use List::MoreUtils  0.33         qw( uniq );
use Path::Class      0.26         qw( file dir );
/;

1;

Yes, that's a big quoted string (q/.../), but no, it's not just evaled.
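If you're curious why a module's import can switch pragmas on for its caller at all, this dependency-free sketch may help. My::Tiny::Syntax is a hypothetical bundle, crammed into a BEGIN block only so that everything fits in one file; a real one would live in its own .pm file.

```perl
use v5.14;
use strict;
use warnings;

BEGIN {
    package My::Tiny::Syntax;

    sub import {
        # import runs while the caller is still being compiled, so lexical
        # pragmas enabled here land in the caller's scope.
        strict->import;
        warnings->import;
    }

    $INC{'My/Tiny/Syntax.pm'} = __FILE__;    # pretend the file was loaded
}

my $strict_is_on;
{
    no strict;
    no warnings;
    use My::Tiny::Syntax;    # switches both back on for the rest of this block

    # Under strict, assigning to an undeclared variable fails to compile.
    $strict_is_on = !eval q{ $undeclared = 1; 1 };
}

say $strict_is_on ? "strict is on" : "strict is off";
```

Pragmas like strict and warnings apply to whatever scope is being compiled at the time, which is why they can be re-exported this simply. Exporting functions into the caller takes more care, and that is the job import::into and Syntax::Collector do for you.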

Bundling up imports into a single module makes it easier to encourage project-wide coding standards. You can't "forget" to enable warnings any more (but of course you can explicitly unimport it). You no longer have any excuse for using ref when you mean blessed, or grep when you want first.

Bundling up imports allows you to consider ideas like true.pm which would seem ridiculous if you needed to repeat them at the top of every file, but become more appealing if they are included as part of an import collection.

And bundling up imports allows you to manage your project's dependencies from a single place. Don't want to depend on List::MoreUtils any more? Then write your own replacement for uniq and get My::Syntax to export that instead. (The Syntax::Collector documentation includes examples of how to write a syntax collection that also acts as an exporter.)

So go on; create your own Perl. Make it your gift to yourself.

Gift Wrapping, part II: Locking the Room

2012-12-15T00:00:00Z

Continuing on the topic of gift wrapping, another traditional manoeuvre for wrapping gifts in peace consists of locking yourself in a room (typically with a sign reading Do Not Enter on the door) while you perform the deed.

With programs, you'll want to do the same thing if your program should only have one instance running at any given time. You want to have a lock file, but then you have to see how arcane things like flock work, think about cross-platform issues… or, maybe, you could use File::Flock::Tiny:

use File::Flock::Tiny;

my $lock = File::Flock::Tiny->write_pid('/tmp/bedroom')
    or die "somebody else is hogging the wrapping space already";

wrap_presents();

# all done
$lock->release;

Niftier still, it turns out that the above example is overkill, because File::Flock::Tiny will automatically release the lock when its $lock object goes out of scope. Knowing that, the $lock->release line is not necessary. This auto-release trick also plays very nicely with Moose. Want to have a script with lockfile functionality? Here goes:

package Gift::Wrapping;

use Moose;
use File::Flock::Tiny;

has lockfile => (
    is      => 'ro',
    isa     => 'Str',
    default => '/tmp/bedroom',
);

has lock => (
    is      => 'ro',
    isa     => 'File::Flock::Tiny::Lock',
    lazy    => 1,
    default => sub {
        my $self = shift;
        File::Flock::Tiny->write_pid($self->lockfile)
            or die "resource already locked\n";
    },
);

before gather_presents => sub {
    my $self = shift;
    $self->lock;
};

...

__PACKAGE__->meta->make_immutable;
1;

With that, your script will judiciously lock itself in the bedroom before taking the presents out of their secret place. As soon as the object is done and goes out of scope (most likely as the script terminates), the lock will be automatically removed.
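For the curious, the release-on-scope-exit behaviour can be approximated with core Perl alone. The My::ScopedLock class below is hypothetical and is not how File::Flock::Tiny is implemented. It also assumes a platform where flock treats two separately opened handles on the same file as competitors, which is my understanding of Linux and the BSDs.

```perl
use v5.14;
use strict;
use warnings;
use File::Temp qw( tempdir );

# My::ScopedLock is a hypothetical class for this sketch only.
package My::ScopedLock {
    use Fcntl qw( :flock );

    # Returns a lock object, or nothing if the file is already locked.
    sub try_lock {
        my ( $class, $path ) = @_;
        open my $fh, '>>', $path or die "Can't open $path: $!";
        flock( $fh, LOCK_EX | LOCK_NB ) or return;
        return bless { fh => $fh }, $class;
    }

    # Runs when the last reference goes away; closing the handle
    # releases the lock.
    sub DESTROY { close $_[0]{fh} }
}

my $lockfile = tempdir( CLEANUP => 1 ) . '/bedroom';

my $second_try;
{
    my $lock = My::ScopedLock->try_lock($lockfile)
        or die "somebody else is in the bedroom";

    # While $lock is alive, a second attempt on the same file is refused.
    $second_try = My::ScopedLock->try_lock($lockfile);
}

# $lock went out of scope above, so the room should be free again.
my $third_try = My::ScopedLock->try_lock($lockfile);

say 'second try: ', ( $second_try ? 'locked' : 'refused' );
say 'third try:  ', ( $third_try  ? 'locked' : 'refused' );
```

Tying the lock to an object's lifetime means there is no release call to forget, even if the code in between dies.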

Oh. This being said, check under the bed. Chances are that locking the room, no matter how cleverly, won't help you if your tykes are already hiding under the bed. Just saying…

Self-contained applications

2012-12-14T00:00:00Z

In No-Dependency Land

While the proliferation of solutions like local::lib and cpanminus has made it a breeze to manage dependencies, there are still some rare occasions in which we need to be able to ship code that has no external non-core dependencies.

There are a few existing solutions for this, but we're going to concentrate on a new one called FatPacker.

Our application

Of course, we just happen to have a sample application we want to pack. It downloads various pages from our website and compiles a statistics report. It uses HTTP::Tiny as a user agent. Our application begins with the lines:

#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

Our app is, surprisingly, saved as the file ourapp.pl.

Packing the deps

App::FatPacker comes with an application called fatpack. You'll use fatpack to get at all of App::FatPacker's features. There are four simple steps for packing your dependencies. Let's go over them.

Tracing

To find out what dependencies our code has, we trace our app. This will create a file called fatpacker.trace, which includes a list of modules that fatpack has discovered.

$ fatpack trace ourapp.pl

In case some modules aren't successfully traced, you can ask fatpack to include them:

$ fatpack trace --use=Additional::Module ourapp.pl

If we open the fatpacker.trace file, we can see it collected a few modules, including both HTTP/Tiny.pm and Carp.pm (which HTTP::Tiny uses).

Gathering packlists

Packlists are files that distributions install. They contain information on which modules are included in the distribution. FatPacker needs to find the packlist for each module in order to make sure it includes all dependencies recursively and does not miss anything. One module is likely to use another module, which might use another module in turn, and so on.

We can call packlists-for with a list of modules, or we can feed it the content of the trace output we created with the previous command. It will print out a list of all the packlists, which we'll simply redirect to a file so we can reuse this information.

$ fatpack packlists-for `cat fatpacker.trace` > packlists

The packlists file will include the path to the packlists of Carp and HTTP::Tiny.

Forming the tree

In this step FatPacker collects all the dependencies recursively into a directory called fatlib, which it will then be able to pack together.

tree needs a list of packlists. Lucky for us, we saved the packlists that our previous command has found in a file called packlists. Let's just call tree and feed it that file.

$ fatpack tree `cat packlists`

Taking a look at our fatlib directory, we'll see the following structure:

    fatlib/
    ├── Carp
    │   └── Heavy.pm
    ├── Carp.pm
    └── HTTP
        └── Tiny.pm

You can clearly see it added HTTP::Tiny and Carp, but you can also see it added Carp::Heavy which comes with Carp. This is what recursively copying dependencies means.

Packing dependencies

Once we have all our dependencies in a directory, we can finally pack it all nicely using the last command: file. This command packs all the modules in the current fatlib directory. It will also try to pack any lib directory that exists in the current directory. If none is present, you will need to create it.

Since the command only packs the modules, we're still missing our code that uses them, so we will concatenate that as well. We will also print this to a new file so we can ship it.

$ (fatpack file; cat ourapp.pl) > ourapp.packed.pl

Stick a shebang line at the top of ourapp.packed.pl and that's all there is to it!

You can now ship ourapp.packed.pl to any location, and it will include all dependencies recursively.

You can open our newly-packed application file and see the way it has packed everything together:

BEGIN {
    my %fatpacked;

    $fatpacked{"Carp.pm"} = <<'CARP';
        ... # entire Carp
    CARP

    $fatpacked{"Carp/Heavy.pm"} = <<'CARP_HEAVY';
        ... # entire Carp::Heavy
    CARP_HEAVY

    $fatpacked{"HTTP/Tiny.pm"} = <<'HTTP_TINY';
        ... # entire HTTP::Tiny
    HTTP_TINY

    # fixing of @INC to load these
    ...
} # END OF FATPACK CODE
#!perl
use strict;
use warnings;
use HTTP::Tiny;

# rest of our code
...
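
The "fixing of @INC" step can be pictured with a small, self-contained sketch. This is not the code App::FatPacker generates; it only illustrates the general idea of an @INC hook that hands perl module source kept in a hash. The module name Greeting::Tiny and the %packed hash are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical in-memory module store, standing in for %fatpacked.
my %packed = (
    'Greeting/Tiny.pm' => <<'END_MODULE',
package Greeting::Tiny;
use strict;
use warnings;
sub greet { return "Hello, $_[1]!" }
1;
END_MODULE
);

# An @INC hook: perl calls this code ref with the relative file name
# (for example "Greeting/Tiny.pm") whenever it is asked to load a module.
unshift @INC, sub {
    my ($self, $file) = @_;
    my $source = $packed{$file} or return;    # not ours: let perl keep looking
    open my $fh, '<', \$source or die "Cannot open in-memory file: $!";
    return $fh;                               # perl compiles whatever it reads here
};

# Loaded from the hash above, not from disk.
require Greeting::Tiny;
my $greeting = Greeting::Tiny->greet('Santa');
print "$greeting\n";
```

Because the hook sits at the front of @INC, a packed copy of a module would win over any copy installed on the machine.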

It's already being used!

At least one famous project uses this method to create a self-contained program: cpanminus. It has proved useful to beginners and seasoned system administrators alike by providing a self-contained, full-fledged CPAN client that is always at your fingertips without any installation required (other than a Perl interpreter, of course).

You can always download a packed cpanminus program and use it, wherever you are, using the following commands:

$ curl -kL cpanmin.us > cpanm
$ perl cpanm Some::Module

Caveat

There are some considerations still:

Compile time code will be run

If you have any compile-time code (think BEGIN blocks), it will be run as part of the tracing step. Generally, such code isn't recommended for most use cases anyway.

If you have any compile-time code which shouldn't run upon tracing, you might want to consider refactoring it into run-time code.

Lazily loaded modules won't be found

Any modules that are loaded lazily (for example, by a require statement that only executes at run time) will not be found by the trace. You can, however, provide them as additional modules for the trace command, as described above.
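
To see why lazy loading hides a dependency, here is a minimal sketch that compares what has been loaded once the code has compiled with what is loaded after it has run. It does not reproduce how fatpack's trace works internally; the wrap_text helper and the choice of Text::Wrap are only for illustration.

```perl
use strict;
use warnings;
use File::Spec ();            # loaded at compile time

sub wrap_text {
    require Text::Wrap;       # loaded lazily, only when this sub first runs
    return Text::Wrap::wrap('', '', @_);
}

# Snapshot %INC as soon as the code above has been compiled.
our %loaded_at_compile_time;
BEGIN { %loaded_at_compile_time = %INC }

my $before = exists $loaded_at_compile_time{'Text/Wrap.pm'} ? 'yes' : 'no';
wrap_text('ho ho ho');
my $after  = exists $INC{'Text/Wrap.pm'} ? 'yes' : 'no';

print "Text::Wrap loaded after compiling: $before; after running: $after\n";
```

Anything that only inspects the program at compile time sees File::Spec but has no way of knowing about Text::Wrap, which is why such modules have to be named explicitly.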

XS modules are not supported

App::FatPacker only supports pure-Perl modules, so if you're using any XS modules, you'll need to have them installed on the machine where the packed program runs.

Take a little REST

2012-12-13T00:00:00Z

About six months ago I learned about resty. I think Stevan Little may have mentioned it in his Web::Machine talk. Unfortunately resty had problems running in zsh. I initially tried to fix the problem, but then I ported it to Perl instead, which was not only easier but also ended up having a lot of other exciting benefits.

What even is that thing?

Adenosine is a tool that allows you to fiddle with RESTful services easily. The basic gist is that you can use HTTP verbs (POST, PUT, DELETE, HEAD, GET, OPTIONS, TRACE) directly in your shell. You get the body of the response on stdout and, if you turn on -v, headers and more on stderr. The exit code is directly related to the error code, and there's a minimal plugin architecture (with more hooks on their way).

How do I use it?

The first thing you need to do with adenosine is to set up your environment to use it:

  $ eval $(adenosine exports)

The next thing you need to do is set the base URI. The base URI is just a URI with a * in it. So, for example, let's start with the DuckDuckGo API. A simple, useful base URI could be set as follows:

 $ adenosine 'http://api.duckduckgo.com/?q=*&o=json'

So with that set all you need to do is:

 $ GET test | pp

The above will put test into the base URI in place of the *. pp is just a tiny json pretty printer bundled with adenosine.

If you don't specify a URI scheme (http:// or https://), your URI will be prepended with http://, and if you don't specify a * it will be appended to your URI.
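
The rules just described are simple enough to sketch in a few lines. This is not adenosine's own code; normalize_base_uri and expand_uri are hypothetical helpers that only mirror the behaviour described above.

```perl
use strict;
use warnings;

# Apply the two defaults: add http:// when no scheme is given,
# and append a * when the URI has no placeholder.
sub normalize_base_uri {
    my ($uri) = @_;
    $uri = "http://$uri" unless $uri =~ m{^https?://};
    $uri .= '*'          unless $uri =~ /\*/;
    return $uri;
}

# Put the argument given to a verb in place of the *.
sub expand_uri {
    my ($base, $path) = @_;
    (my $full = $base) =~ s/\*/$path/;
    return $full;
}

my $base = normalize_base_uri('api.duckduckgo.com/?q=*&o=json');
print expand_uri($base, 'test'), "\n";
```

With the base URI from the example, GET test would therefore end up requesting the URI with q=test in it.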

GET isn't all you can do, though it's certainly what I do most often. Here's an example of how I might send a text message with our API at work:

$ adenosine 'http://our.api.com/api/2/*/sms'
$ POST myaccount '{"message":"Hello Frew!","destinations":[8675309]}' \
  -H 'Content-Type: application/json' -H 'Accept: application/json'

If you want to edit the data you are about to post, use adenosine's -V switch to open your $EDITOR.

There's more in the documentation, but that's basically how it works.

Too much to type!

Sometimes you'll want to set certain headers for a given host. For example, in my previous example I need to set the Content-Type and Accept headers so that my application will do the right thing. I actually always want to set those headers when interacting with my application. The way to do this nicely is to create a configuration file for my server. For example, I could create the following:

~/.resty/our.api.com:

POST -H 'Content-Type: application/json' -H 'Accept: application/json'
PUT -H 'Content-Type: application/json' -H 'Accept: application/json'
DELETE -H 'Content-Type: application/json' -H 'Accept: application/json'
GET -H 'Content-Type: application/json' -H 'Accept: application/json'

That will set those two headers for all four of the major HTTP verbs, so the previous example could now be merely:

  $ POST myaccount '{"message":"Hello Frew!","destinations":[8675309]}'

Plugins

One of the most exciting new features of adenosine (vs. resty) is that it supports plugins. I initially just wrote two: Stopwatch and Rainbow.

`Stopwatch`

Stopwatch adds timing info to the output from -v. I like to know how long various commands and requests take, especially when I am the implementor of said command. If something takes longer than 0.5s, I did a bad job. So Stopwatch gives me exactly the information I need. To enable it, put the following in ~/.adenosinerc.yml:

plugins:
   - ::Stopwatch

`Rainbow`

Rainbow color-codes the output from -v. I really like this, but obviously it's not for everyone. At the most basic, you can enable it the same way that you enable Stopwatch, but that just gives you the most basic color coding. Rainbow is implemented to be easily themable as well as overridable. If you just wanted to override the color of the method from the request, put the following in ~/.adenosinerc.yml:

plugins:
   - ::Rainbow: {
         request_method_color: cyan
     }

That's fine for experimentation, but I'd like to encourage everyone to make their own themes and submit them as pull requests. To make a theme, all you need to do is create a file as follows:

package App::Adenosine::Plugin::Rainbow::Halloween;

use Moo;
extends 'App::Adenosine::Plugin::Rainbow';
has '+response_header_colon_color' => (default => sub { '' });
has '+response_header_name_color'  => (default => sub { 'orange1' });
has '+response_header_value_color' => (default => sub { 'orange2' });
# ...

Rainbow uses Term::ExtendedColor, so to see what colors are available run the color_matrix script that comes with it. Also note that while in the example above only a single color is specified, the foreground, background, and even a few other (spottily supported) attributes may be set:

has '+response_header_value_color' => (
  default => sub {
     {
        fg        => 'orange2',
        bg        => 'cyan', # what a bad choice
        bold      => 1,
        italic    => 1,
        underline => 1,
     }
  }
);

Once you've done that, your ~/.adenosinerc.yml can reference your theme directly:

plugins:
   - ::Rainbow::Halloween
   - ::Stopwatch

…and that's adenosine! Please play with it and let me know if you like it!

Installing `adenosine` without CPAN

Most readers of this article are likely to be comfortable installing adenosine from the CPAN, but if you don't want to use CPAN, or you somehow got to this post as a non-japh, this might be more your speed:

 git clone http://github.com/frioux/app-adenosine-prefab
 source app-adenosine-prefab/adenosine-exports

Testing networking client code using Test::LWP::UserAgent

2012-12-12T00:00:00Z

Test::LWP::UserAgent is a module I wrote after writing several networking client libraries for $work with inconsistent and spotty test coverage — what I most wanted to do was fully simulate the server end of a network connection, without having to delve deeply into LWP's internal implementation, nor mock a lot of methods, which the traditional mock object approach would require.

Exploring the options available led me to Yury Zavarin's Test::Mock::LWP::Dispatch, whose API I adapted into the initial version of Test::LWP::UserAgent. It behaves exactly like LWP::UserAgent, one of the most popular HTTP client libraries in Perl, in all respects except for the portion that actually sends the request out to the network: at that point it returns the first preconfigured response that matches the outbound request.

my $useragent = Test::LWP::UserAgent->new;
$useragent->map_response(qr/example.com/, HTTP::Response->new(200));

my $response = $useragent->get('http://example.com');
# prints 200
say $response->code;

$response = $useragent->get('http://google.com');
# prints 404
say $response->code;

In the above example, no outbound request passing through this user agent will use the live network (this is the default behaviour). Any request whose URI matches /example.com/ will receive an HTTP 200 response, and the remaining requests will return a 404.
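
The "first matching response wins, otherwise 404" rule can be modelled in plain Perl. The following is a deliberately simplified sketch and not the module's implementation: it maps patterns to bare status codes rather than HTTP::Response objects, and the names @response_map, map_code and code_for are invented for the illustration.

```perl
use strict;
use warnings;

# Ordered list of [ pattern, status code ] pairs.
my @response_map;

sub map_code {
    my ($pattern, $code) = @_;
    push @response_map, [ $pattern, $code ];
    return;
}

# Return the code of the first mapping whose pattern matches the URI,
# or 404 when nothing matches.
sub code_for {
    my ($uri) = @_;
    for my $entry (@response_map) {
        my ($pattern, $code) = @$entry;
        return $code if $uri =~ $pattern;
    }
    return 404;
}

map_code(qr/example\.com/, 200);
map_code(qr/example/,      500);   # never reached for example.com URIs

print code_for('http://example.com'), "\n";
print code_for('http://google.com'),  "\n";
```

Because the mappings are consulted in order, more specific patterns should be registered before more general ones.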

If, however, you wish to capture only some requests, while letting the remainder use the network normally, you can enable the network_fallback feature:

my $useragent = Test::LWP::UserAgent->new;
$useragent->network_fallback(1);
$useragent->map_response(qr/example.com/, HTTP::Response->new(200));

my $response = $useragent->get('http://example.com');
# prints 200
say $response->code;

$response = $useragent->get('http://google.com');
# prints 200 … if Google is up!
say $response->code;

And indeed if you inspect the $response object returned, you will see that it contains the actual network response from contacting http://google.com.

Configuration can also be done globally, if you want all user agents in your program to use the same settings (or if you do not have direct control over the actual user agent object being used, but just its class):

Test::LWP::UserAgent->map_response(...);
my $response = Test::LWP::UserAgent->new->request(...);

Test::LWP::UserAgent inherits from LWP::UserAgent, so it satisfies isa requirements that you may have via Moose or another system that uses type checking. This means that all the normal options available in LWP::UserAgent are still available to you, and work identically, for example:

my $useragent = Test::LWP::UserAgent->new(
    timeout => 10,
    cookie_jar => { file => "$ENV{HOME}/.cookies.txt" },
);

You can also use Test::LWP::UserAgent to connect to a local PSGI application seamlessly, which can be very useful when you have a client and server installed on the same box but do not want to fuss with separate code for handling this case, or if you want more fine-grained control over what responses to send:

my $app = Plack::Util::load_psgi('./myapp.psgi');
$useragent->register_psgi('mytestdomain.com', $app);
my $response = $useragent->request(...);

CODE EXAMPLES

The code examples above are fleshed out as fully-working code in the examples/ directory under http://metacpan.org/release/Test-LWP-UserAgent, along with a detailed example of some unit tests for a hypothetical networking client library.

Making a list and checking it twice

2012-12-11T00:00:00Z

Poor Santa. It's getting close to Christmas, and he still hasn't made his list of CPAN authors yet! Who will get coal? Who will get a present?

With the clock ticking, he decides to automate his list building with Perl, but still needs to check it twice.

Santa doesn't need a GUI, he likes the terminal just fine, so he's going to write a program to decide if the CPAN authors are naughty or nice and then prompt him to confirm each one.

How should he prompt? He likes the nice simple prompt of ExtUtils::MakeMaker, should he use it?

    $ perl -MExtUtils::MakeMaker -wE 'prompt("Naughty?", "yes")'
    Naughty? [yes]

Loading all of ExtUtils::MakeMaker just to get a prompt is just a little bit gross, so maybe he can do better. Next stop… CPAN!

So many prompting modules to choose from… (sigh)… the usual problem with CPAN. But wait, look! A "Tiny" module, and it works exactly like ExtUtils::MakeMaker… IO::Prompt::Tiny.

Santa decides to give it a try:

use v5.14;
use warnings;

use File::Slurp qw/write_file/;
use IO::Prompt::Tiny qw/prompt/;
use ORDB::CPANUploads;
use Time::Piece;

# Find each CPAN author's latest release date
# If they didn't release in 2012, let's call them naughty
my $results = ORDB::CPANUploads->selectall_arrayref(
  'select author, max(released) from uploads group by author',
);

# Make a list
my %list;

for my $author (@$results) {
  my $id   = $author->[0];
  my $date = gmtime( $author->[1] );

  # Check it once
  my $prompt = sprintf( "\n%-9s released on %s.  Naughty? (yes/no)", $id, $date );
  my $ans = prompt( $prompt, $date->year < 2012 ? "yes" : "no" );
  $list{$id} = ( $ans =~ /^y/i ? "naughty" : "nice" );

  # Check it twice
  my $check = prompt( "Are you sure $id is $list{$id}? (yes/no)", "yes" );
  redo unless $check =~ /^y/i;
}

write_file(
  'list.txt',
  { atomic => 1 },
  map { "$_ is $list{$_}\n" } sort keys %list
);
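
The only real decision in the script is the default answer offered at each prompt, and that part can be tried on its own without a database or a terminal. A small sketch, assuming the release dates are epoch seconds as in the script above; default_answer is a hypothetical helper and the two epochs are merely sample values.

```perl
use strict;
use warnings;
use Time::Piece;    # overrides gmtime to return an object with a year method

# Default answer to "Naughty?" for a latest release at the given epoch (UTC).
sub default_answer {
    my ($epoch) = @_;
    return gmtime($epoch)->year < 2012 ? 'yes' : 'no';
}

my $released_2011 = default_answer(1297629179);    # an epoch in February 2011
my $released_2012 = default_answer(1331495400);    # an epoch in March 2012

print "2011 release, naughty? $released_2011\n";
print "2012 release, naughty? $released_2012\n";
```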

Excellent, now he's ready to make his list and find all those naughty CPAN authors who haven't released in 2012:

    $ perl naughty-or-nice.pl

    AADLER    released on Sun Feb 13 20:32:59 2011.  Naughty? (yes/no) [yes] 
    Are you sure AADLER is naughty? (yes/no) [yes] 

    AAKD      released on Wed Nov  9 06:26:08 2005.  Naughty? (yes/no) [yes] 
    Are you sure AAKD is naughty? (yes/no) [yes] 

    AAKHTER   released on Thu Nov 10 03:18:33 2005.  Naughty? (yes/no) [yes] 
    Are you sure AAKHTER is naughty? (yes/no) [yes] 

    AALLAN    released on Fri Nov 17 20:44:54 2006.  Naughty? (yes/no) [yes] 
    Are you sure AALLAN is naughty? (yes/no) [yes] 

    AANOAA    released on Sat Feb 12 09:10:34 2011.  Naughty? (yes/no) [yes] 
    Are you sure AANOAA is naughty? (yes/no) [yes] 

    AAR       released on Sun Mar 11 19:50:00 2012.  Naughty? (yes/no) [no] 
    Are you sure AAR is nice? (yes/no) [yes] 

    AARDEN    released on Wed Apr 30 16:19:59 2003.  Naughty? (yes/no) [yes] 
    Are you sure AARDEN is naughty? (yes/no) [yes]

    ⋮

    ZWON      released on Thu Sep  1 15:21:14 2011.  Naughty? (yes/no) [yes] yes
    Are you sure ZWON is naughty? (yes/no) [yes] yes

    ZZZ       released on Tue Aug 16 17:05:53 2011.  Naughty? (yes/no) [yes] yes
    Are you sure ZZZ is naughty? (yes/no) [yes] yes

After bouncing on the return key almost 12,000 times, Santa is ready for Christmas on CPAN!

The Greatest Tradition of All

2012-12-10T00:00:00Z

The Holidays. No time in the calendar can boast a higher density of traditions per diem than this glorious tail end of the year. But amidst this gleeful storm of traditions dealing with the culinary, the jocose, the social, one particularly stands out for being almost universally observed.

I'm talking about the hallowed ritual of returning presents.

This problem, funnily enough, isn't restricted to packages that come with bows and glitter. Perl distributions, historically, don't come with a return policy. That is, once distribution Foo-Bar is installed, it stays installed. Like aunt Thelma's pan-chromatic acrylic sweater, it would linger at the back of the closet -- dusty, unused and untouched by more-discriminating-than-that moths, an immutable testament that taste isn't a trait passed through genetic lines. Which usually isn't such a big problem: computers usually have deep closet spaces.

When Gifts Turn to Time-Bombs

There are, however, a few cases where that lingering can cause issues. The most common is when those ghosts of installation pasts interfere with the current festivities.

For example, if version 1 of the distribution Holidays-Activities contains

    /lib/Holidays/Activities/SingSongs.pm
    /lib/Holidays/Activities/BeMerry.pm
    /lib/Holidays/Activities/MeetFriends.pm
    /lib/Holidays/Activities/DressUpAsElf.pm

and version 2, after realising that tights, bell-adorned curly slippers and short skirts aren't for everybody, revises its payload to be:

    /lib/Holidays/Activities/SingSongs.pm
    /lib/Holidays/Activities/BeMerry.pm
    /lib/Holidays/Activities/MeetFriends.pm
    /lib/Holidays/Activities/DressUpAsSanta.pm

then somebody first installing version 1, and then version 2, will end up with some of last year's leftovers, so to speak:

    /lib/Holidays/Activities/SingSongs.pm
    /lib/Holidays/Activities/BeMerry.pm
    /lib/Holidays/Activities/MeetFriends.pm
    /lib/Holidays/Activities/DressUpAsElf.pm
    /lib/Holidays/Activities/DressUpAsSanta.pm
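
Put another way, the leftovers are simply the files that version 1 shipped and version 2 no longer does. A minimal sketch of that bookkeeping, using only the module base names from the listings above; the variable names are illustrative.

```perl
use strict;
use warnings;

my @version_1 = qw(SingSongs BeMerry MeetFriends DressUpAsElf);
my @version_2 = qw(SingSongs BeMerry MeetFriends DressUpAsSanta);

# Files from version 1 that version 2 does not overwrite stay on disk.
my %shipped_in_v2 = map { $_ => 1 } @version_2;
my @leftovers     = grep { !$shipped_in_v2{$_} } @version_1;

# What ends up installed after 1 and then 2: the union of both payloads.
my %on_disk = map { $_ => 1 } @version_1, @version_2;

print "leftover: $_\n" for @leftovers;
print scalar(keys %on_disk), " files on disk\n";
```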

Yet, this is still not guaranteed to cause a problem. If the Holidays-Activities modules are called piecemeal, nobody with a lick of good sense will ever call use Holidays::Activities::DressUpAsElf, and everybody's eyeballs will be spared. But if, say, Holidays::Activities uses a snazzy module like Module::Pluggable to find all the activities one needs to do during those merry times:

package Holidays::Activities;

use Module::Pluggable
    search_path => ['Holidays::Activities'],
    require     => 1;

sub celebrate {
    my $self = shift;

    $_->perform($self) for $self->plugins;
}

...

BAM! We end up with a user sporting tights, a cheerfully rotund pot-belly, lots of facial hair and a pointy nose. That sound you're hearing in your head? That's the screams of millions of children for whom the Holidays will never be the same.

Ghost Of Installs Yet To Come, Give Me a Chance!

Fortunately, all is not lost. As it happens, most modern CPAN clients provide a little-known-yet-handy stash of Pepto-Bismol alongside the milk and cookies: the installation of a distribution will also include a .packlist file listing all installed files (the modules, the scripts, the manpages, the whole deal). For example, for a Dist::Zilla installed in /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla.pm, a .packlist will be created in /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/x86_64-linux/auto/Dist/Zilla/.packlist and will look like:

    /opt/perlbrew/perls/perl-5.16.1/bin/dzil
    /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla.pm
    /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla/App.pm
    /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla/App/Command.pm
    /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla/App/Command/add.pm
    /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla/App/Command/authordeps.pm
    /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla/App/Command/build.pm
    /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla/App/Command/clean.pm
    /opt/perlbrew/perls/perl-5.16.1/lib/site_perl/5.16.1/Dist/Zilla/App/Command/install.pm
    ... and so on, and so forth ...
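
A .packlist can be read with ExtUtils::Packlist, which ships with perl. The sketch below builds a throwaway .packlist in a temporary directory rather than touching a real installation, so the paths in it are made up; only the reading part reflects what you would do with a real file.

```perl
use strict;
use warnings;
use ExtUtils::Packlist;
use File::Temp ();

# Build a throwaway .packlist so the sketch does not touch a real install.
my $dir  = File::Temp->newdir;
my $file = "$dir/.packlist";
open my $out, '>', $file or die "Cannot write $file: $!";
print {$out} "$dir/lib/Holidays/Activities/SingSongs.pm\n",
             "$dir/lib/Holidays/Activities/DressUpAsElf.pm\n";
close $out or die "Cannot close $file: $!";

# Passing a file name to new() reads it; the object then behaves like a
# hash whose keys are the installed paths.
my $packlist  = ExtUtils::Packlist->new($file);
my @installed = sort keys %$packlist;

print "$_\n" for @installed;
```

An uninstaller has little more to do than walk that list and remove each file, which is why a missing .packlist makes clean removal so much harder.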

Word of caution: there are some instances where those .packlist files will be missing. Some Linux distributions, belonging squarely on the naughty list, don't include them in their packaging of perl distributions. It's also possible, if you still celebrate the Saturnalia and run on a pre-modern era perl, that you won't have them.

Are the Batteries Included With That Toy?

Partially. If one digs deep into the lore of Module::Build, one would find an almost-undocumented --uninst option that stands for exactly what you think. So it's possible to do, during a manual installation dance:

    perl Build.PL
    ./Build test
    ./Build install --uninst 1

ExtUtils::MakeMaker also offers a similar command:

    perl Makefile.PL
    make test
    make install UNINST=1

As was discussed recently, however, both systems seem to have issues. And, besides, while a mechanism is offered, it puts the onus of cleaning up the leftovers of past parties on the user. Manual cleaning, to boot. Hardly optimal.

We Need the Help of Elves (just skip the tights, please)

Fairly recently, Module::Build::CleanInstall made its appearance to help with this problem. This module is a subclass of Module::Build that simply adds an automatic uninstallation of previous versions of a distribution (provided that the .packlist file is found) before doing the new install. Its usage couldn't be simpler: change all mentions of Module::Build to Module::Build::CleanInstall in the Build.PL:

use strict;
use warnings;

use Module::Build::CleanInstall;

Module::Build::CleanInstall->new(
    dist_name => "Holidays-Activities",
    ... # same as Module::Build
)->create_build_script;

and you're good to go (just don't forget to add Module::Build::CleanInstall as a build dependency in your distribution META.json or META.yml).

A different solution has also emerged for the special case of File::ShareDir-based shared files. Instead of trying to remove past files, this approach proposes to bundle all shared files in a single tarball. As the name of the tarball remains the same from one version to the next, one is thus assured that the new file will always clobber its previous incarnation. And the space saving brought by the compression could easily be seen as a nice dollop of whipped cream atop an already appealing hot cocoa. Of course, for the author, dealing with the tarballing of the shared files at release time, and handling their uncompression in the code is one more fixing that must be added to the already extensive work-feast that is release management. And that is why two modules have been created to help there: Dist::Zilla::Plugin::ShareDir::Tarball auto-tarballs share directories at release time (provided that you use Dist::Zilla, natch):

; in your dist.ini file

[ShareDir]
[ShareDir::Tarball]

and File::ShareDir::Tarball takes care of that pesky extraction business for you, and transparently provides the user with the extracted temporary directory/files.

use File::ShareDir::Tarball ':all';

my $dir = dist_dir('Holidays-Activities');
# $dir is now the path to a temporary directory holding
# the extracted content of the shared files tarball

Aunts Will Be Aunts, Garish Sweaters Will Still Pop Up

... but at least now we have a few more avenues to quietly deal with those presents once the party is over. And this is good, for a sage once said:

   "To receive is pleasure, and to give is higher pleasure still. But to
   give back and get what you really wanted all along with a minimum of fuss
   and without waiting in line with your coat on and screaming brats right
   behind you for hours on end is the greatest pleasure of
   them all."

Or something like that. I might be paraphrasing.

Fixing the regexp code block facility

2012-12-09T00:00:00Z

The /(?{})/ code-execution facility was added to regular expressions back in 1998, in the 5.005 release. Since then it's been sitting there, marked experimental, until in 2012, the implementation was completely re-written for the 5.17.1 release.

The way it originally worked was that during regex compilation, if an opening (?{ was seen, the balancing } was found, and the text between the braces was passed to perl's internal eval mechanism. But after compiling the code, the execution was skipped, and instead the optree and pad of the compiled eval were saved and attached to the regexp object. Later when the regexp was being executed, the current pad would be set to the saved pad, and the ops in the optree called.

So, what's wrong with that?

Well, everything really.

First, at the most trivial level, the code isn't properly parsed, so something like /(?{ $x = '{' })/ is an error, due to the simplistic counting of balancing braces. This is in contrast to something like "foo$hash{ $x ? '{' : '[' }bar", where the expression for the hash index doesn't require balanced braces.

/(?{ $x = '{' })/ # <-- was an error

"foo$hash{ $x ? '{' : '[' }bar" # <-- not an error

So the first change was to integrate the parsing of the code blocks with the parsing of the surrounding Perl code, at least for literal regexes.

The second big issue was that by just saving the pad and resurrecting it from time to time, lexicals at best did the wrong thing, and at worst caused segfaults. In particular, the behaviour of closures didn't match reasonable expectations. For example, this code:

for my $i (0..2) {
    push @r, qr/^(??{$i})$/;
}
print "ok 0\n" if "0" =~ $r[0];
print "ok 1\n" if "1" =~ $r[1];
print "ok 2\n" if "2" =~ $r[2];

prints out three ok's now, but formerly printed nothing. It works because in terms of pads, closures, etc, these:

/A(?{B})C/;
$r = qr/A(?{B})C/;

(where B is a block of code) are now parsed (in terms of lexicals) roughly as:

/A/ && do {B} && /C/;
$r = sub { /A/ && do {B} && /C/ };

That is, in the first line, the code block shares the same pad as the surrounding code, while in the second example it uses the pad of a hidden anonymous sub, which is cloned anew on each call to qr//. This makes it all Just Do the Right Thing. qr// constructs that contain arbitrary code now act like closures.

However, Perl also supports patterns that are determined at runtime, or which contain a mixture of compile- and runtime patterns, such as

my $pat = 'C(?{D})';
use re 'eval';
/A(?{B})-$pat/;

Formerly, as the run-time pattern was being assembled, any bits of literal code (such as the B above) would be recompiled, destroying any closure information. Now, such code snippets are preserved, and only the non-literal bits are compiled. Similarly where regexp objects are included within a larger pattern:

my $re = qr/C(?{D})/;
use re 'eval';
/A(??{B})-$re/;

Although the text of the $re pattern is interpolated and recompiled, any code blocks within $re are not recompiled.
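
Here is a small runnable sketch of the mixed case, assuming a perl with the rewritten implementation (5.18 or later). The @log array and the strings pushed onto it are only there to make the order of execution visible; they are not part of any API.

```perl
use 5.018;
use warnings;

my @log;

# A pattern fragment that only exists as a string until run time.
my $pat = 'b(?{ push @log, "run-time" })';

{
    # Required because $pat contributes a code block when interpolated.
    use re 'eval';

    'ab' =~ /a(?{ push @log, "literal" })$pat/
        or die "expected a match";
}

print join(', ', @log), "\n";
```

The literal block is compiled together with the surrounding code, while the block inside $pat is only compiled when the pattern is assembled; both can see the lexical @log.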

Finally, because pads are handled properly now, things don't go awry during recursion:

# test-recurse-regex.pl
sub recurse {
    my ($n) = @_;
    return if $n > 2;
    print "ok\n" if "A$n" =~ /^A(??{$n})$/;
    recurse($n+1);
}
recurse(0);

…and then…

$ perl test-recurse-regex.pl
ok
ok
ok

There were lots of other subtleties involved, but those are the ones I can think of off the top of my head. These bugs made the entire (?{}) and (??{}) features unreliable in earlier perls, but with the upcoming perl 5.18 release, it should work sanely and predictably!

Atomic Gift Wrapping

2012-12-08T00:00:00Z

Gift wrapping. You rarely want to do it by the Christmas Tree because you know, you just know that just when you're in the middle of wrapping that huge Dino-Rampage Total Combat Battlezone box, one of your child processes will innocently try to access the living room resources, see the half-ready present, go totally ape-boinker, and make every single day till Christmas a grueling hell. So instead, you usually wrap the presents in some dark corner of the house — usually the attic, the shed, or the secret room you built specifically for that purpose. Then, when the pile o' prezzies is ready, you carefully peek out in the hallway, make sure there is no living being in sight, rush down the stairs holding the loot with every prehensible limb available, and dump the whole thing under the tree. When the amazed kids ask how the glittery boxes got there? Well… Magic.

Atomic file writing? Exactly the same thing.

Most of the time, you can write files at your leisure, but sometimes you have other programs that can access the file at any time, and you don't want them to end up with bits and pieces. So you either begin to play with locking the file ("No one enter the room until I'm done!") or you write the new file in a different place, and when everything is ready, you do a quick switcheroo where you replace the old copy with its new incarnation. There is still a window where things can go wrong, but it's a minimal one.
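
Done by hand with core modules only, the write-elsewhere-then-swap approach looks roughly like the sketch below. It is a bare-bones illustration under the assumption that the temporary file and the target live on the same filesystem; atomic_write is a hypothetical helper, not part of any module.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp ();

sub atomic_write {
    my ($path, $content) = @_;

    # Create the temporary file next to the target so that the final
    # rename stays on one filesystem.
    my $tmp = File::Temp->new( DIR => dirname($path), UNLINK => 0 );
    print {$tmp} $content or die "Cannot write temporary file: $!";
    close $tmp            or die "Cannot close temporary file: $!";

    # The switcheroo: readers see either the old file or the new one.
    rename( $tmp->filename, $path ) or die "Cannot rename to $path: $!";
    return;
}

my $dir      = File::Temp->newdir;
my $wishlist = "$dir/wishlist";

atomic_write( $wishlist, "A pony\n" );
atomic_write( $wishlist, "A pony\nA rocket\n" );

open my $in, '<', $wishlist or die "Cannot read $wishlist: $!";
my $final = do { local $/; <$in> };
close $in;
print $final;
```

This leaves out the finer points, such as file permissions, verifying what was written, and cleaning up after a failed write.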

Of course, the switcheroo is much easier (and funnier) to say than to do. That is, unless you use File::AtomicWrite, which takes care of all the nitty gritty details for you. To wit:

use File::AtomicWrite;

my $present_list = File::AtomicWrite->new({ file => '/etc/wishlist' });

my $fh = $present_list->fh;

say {$fh} 'For Xmas, I want:';

say {$fh} 'A pony';

# snip 10,000 lines

say {$fh} 'And a Rocket-Raptor Sky Armageddon action figure';

$present_list->commit;

Or, alternatively:

use File::AtomicWrite;

File::AtomicWrite->write_file({
  file  => '/etc/wishlist',
  input => \$all_i_want,
});

To accommodate different levels of paranoia, the module allows for several options, including the directory in which the temporary file is written, the template for the name of said temporary file, optional checksum of the data to be written, the minimal size that data should be (wishlist under 50K? IMPOSSIBLE!), and much, much more. But, and this is the good news, all the hard stuff has been taken off your plate. Now all you need to do is to write to the file (easy) or, in the real world equivalent, find the gift that needs to be appended under the tree (… okay, maybe not that easy).

Is your code… Safe?

2012-12-07T00:00:00Z

Today we'll have a little chat about the Safe module. What does it do, how does it work and when to use it?

The Purpose

Safe's purpose is to provide a restricted eval() function to perl, which will function as the regular eval(STRING) built-in, except in two important points:

This restricted eval() will refuse to compile certain built-ins (the list being customizable, so for example you can prevent compilation of all filesystem access functions, or just some, or none)
Moreover, it will compile code in a separate, quarantined namespace, where the data of your main program will not be accessible.

The Example

Quick, a code example, to see what it looks like:

use v5.14.0;
use warnings;
use Safe;

# create a Safe compartment
my $compartment = Safe->new;
$compartment->deny(qw(:base_loop));

$_ = 2;

# First try
my $result = $compartment->reval( q{ 40 + $_ } );
defined $result or die "Safe compilation error: $@";
say $result;

# Second try
$result = $compartment->reval( q{ $_++ for 1..40 } );
defined $result or die "Safe compilation error: $@";
say $result;

In this example we start by creating a fairly restricted Safe compartment, where not only the default set of built-ins is forbidden, but also all loop built-ins (for, while, etc.)

The result of the first try will be 42, since the addition and the variable fetching are still permitted operations. The second try will fail with the error message:

  'foreach loop entry' trapped by operation mask

The Basics

There are a couple of points worth noting even in such a small example.

First, we use the deny() method to deny more operations than the default set. Safe provides deny(), permit(), deny_only() and permit_only() to customize this set more finely; you can pass to those methods lists of individual op names (as known to the perl internals) or handy predefined bundles (like :base_loop). Those bundles are listed in the Opcode man page.

Secondly, just like eval(), a compilation error reported by the reval() method will be in the $@ variable.

Thirdly, we used $_ in the string we've been reval-ing. But the namespaces were supposed to be separated? Did I lie? Of course not, I did not lie, and this isn't a bug in Safe either. The fact is that $_ is one of the few variables that are shared by default between the program's global namespace and the compartment's one.

To verify this, change the "first try" lines to this, and observe how the $result will now be 40 instead of 42:

our $x = 2; # or "my $x = 2", does not matter
my $result = $compartment->reval( q{ 40 + $x } );

Of course, the list of variables you want to share can be changed, too:

our $x = 2;
$compartment->share('$x');
my $result = $compartment->reval( q{ 40 + $x } );
say $result; # will now say '42'

You can specify that you want to share functions as well, so the reval-ed code will be able to call them. By default, Safe will share the *_ glob (so, $_, @_, etc.) and a quite long list of built-in functions that are often called behind the scenes (like &UNIVERSAL::isa or &utf8::downgrade). Use the Source for the full list, which is perl-version-dependent.
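Sharing a function looks much the same as sharing a variable. The following is only a sketch: the shout function is a hypothetical example, and the only Safe calls used are new, share and reval.

```perl
use strict;
use warnings;
use Safe;

# An ordinary function living in the main program (made up for the demo).
sub shout { return uc( $_[0] ) . "!" }

my $compartment = Safe->new;

# Share the function by name, with its & sigil, just like '$x' earlier.
$compartment->share('&shout');

my $greeting = $compartment->reval( q{ shout("ho ho ho") } );
die "reval failed: $@" if $@;

print "$greeting\n";
```

The function body was compiled in the main program, outside the compartment, so sharing a function hands the evaluated code whatever that function is able to do. Share sparingly.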

The Details

So what exactly is this new namespace that Safe is masking main:: under? The root() method allows you to access it, as in the following example:

use v5.14.0;
use warnings;
use Safe;

my $compartment = Safe->new;
my $root = $compartment->root;

say "root namespace name: $root";

my $result = $compartment->reval( q{ $x = 42 } );
say "result = $result";

no strict 'refs';
say "safe's \$x : ", ${ $root.'::x' }; # 42 too !

On my machine, the root() method returns the string Safe::Root0 in this example, so the variable introduced as $x in the evaluated code is known as $Safe::Root0::x in the outer program.

Also, you'll notice that $x has been compiled by Safe without fussing about Global symbol "$x" requires explicit package name: the ambient pragmas are not passed to the reval(). If you want to enforce strictures in the compilation phase, you have to call reval() with a second boolean parameter set to true:

    my $result = $compartment->reval( q{ our $x = 42 }, 1 );
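To see the difference the second parameter makes, here is a small sketch that evaluates an undeclared variable both ways. The variable names are arbitrary, and the exact wording of the error depends on your perl version.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# Without the second argument, an undeclared $x compiles quietly.
my $lax       = $compartment->reval( q{ $x = 42 } );
my $lax_error = $@;

# With a true second argument, strictures apply inside the compartment,
# so the undeclared $y is a compilation error.
my $strict       = $compartment->reval( q{ $y = 42 }, 1 );
my $strict_error = $@;

print "lax result:   $lax\n";
print "strict error: $strict_error";
```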

The Lengths

As you can imagine, a popular game is to get Safe to execute code that it shouldn't. Safe goes to some lengths to avoid this. Here are two of those, just to excite your imagination:

Destructor destruction. Before exiting from a reval(), Safe will check whether any class gained new methods, and if so, it will delete every DESTROY and AUTOLOAD it finds under its root namespace. This is to prevent destructors or functions created inside the compartment from being run outside of it (for example if the reval() returns to its caller a newly crafted object).

Closure closing. Safe provides a method wrap_code_ref() that will take a code reference as an argument, and return a version of it wrapped in a reval() (that's the short story -- check the source for the gory details). Subsequently, reval() will check its return values for any code references (recursively, if it returns hash or array references), and will invoke wrap_code_ref() on any code reference found there before passing them to you.
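From the caller's side, closure closing is invisible: you just get a code reference back and call it. A minimal sketch, assuming the default compartment settings (the $adder name is made up):

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;

# The evaluated code returns an anonymous sub...
my $adder = $compartment->reval( q{ sub { $_[0] + 2 } } );
die "reval failed: $@" if $@;

# ...which can be called from the outer program like any code reference.
# As described above, what we hold here should be the wrapped version.
my $answer = $adder->(40);

print "answer: $answer\n";
```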

The Caveats

TL;DR: No silver bullet, etc.

Longer version (but really it's just common sense): the name of the Safe module is misleading. It should have been called Restricted::Sortof or something. It has its uses, but making the evaluation of foreign code safe is not one of them. Even in a very restricted compartment, it's possible to introduce a pathologically slow regular expression, or a pathologically long loop, or a pathologically big string. Any use of Safe for serious security purposes is basically misguided.

Checking Out Your Data Structures

2012-12-06T00:00:00Z

Have Your Variables Been Naughty or Nice?

Every once in a while you expect your variables to contain a certain value, only to realize, sometimes a bit too late, that something's off. We've all been there, and having a way to quickly and neatly view the contents of your variables can make all the difference in the world.

Enter Data::Printer, a module that formats and prints your data structures on screen, in a way that lets you easily check them and spot errors. Its output is colored by default, and it also contains several filters to help you debug objects.

Using it couldn't be simpler: Data::Printer exports a p() function to your namespace that you use to dump your data to STDERR (or anywhere else in fact):

use Data::Printer;

...

p $some_variable;

Since it's a debugging module you'll likely be turning it on and off everywhere in your code. If that's the case, a common idiom is to simply add this line when you need to check some data:

use DDP; p $some_variable;

Which takes advantage of DDP, a shorter alias for Data::Printer.

Optimized for Humans

Now, if you were using it to view the content of a complex data structure, this is what you might get:

\ [
    [0] "/path/from/env" (TAINTED),
    [1] [
        [0] "foo",
        [1] "bar"
    ],
    [2] {
        name     "",
        gifts    var[1]
    },
    [3] \ "Some string reference" (weak)
]

Did you see what just happened? Not only did Data::Printer show you the contents of your variable in a clear, colored and indented fashion, it also let you see array indices, know about circular references, tainted data, and weak references, and it can detect and describe many more facts about your data!

But enough about plain data structures. Let's try it with an object:

package My::Class {
  sub new {
    my $class = shift;
    return bless { num => 42 }, $class;
  }

  sub foo {}
  sub bar {}
  sub _baz {}
};

If you use Data::Printer on an instance of the class defined above, you'll see something like this when you dump it:

My::Class {
    public methods (3) : bar,foo,new
    private methods (1) : _baz
    internals: {
        num     42
    }
}

Pretty neat, huh? It would even show inheritance if we had any =)

Filters

Another of Data::Printer's strengths lies in how it lets you easily filter Perl types and classes. The basic distribution includes formatters for some popular modules like DateTime, Digest and DBI, so if you have enabled them in your settings, then this:

my $data = {
    datetime => DateTime->new( year => 2012, month => 12, day => 25 ),
    dbh      => DBI->connect($dsn, $user, $pass),
    digest   => Digest::MD5->new,
};

use DDP; p $data;

Might show you something like this:

\ {
    datetime    2012-12-25T00:00:00 [floating],
    dbh         mysql Database Handle (connected) {
        database: mydb
        Auto Commit: 1
        Statement Handles: 0
        Last Statement: -
    }
    digest    d41d8cd98f00b204e9800998ecf8427e [reset],
}

Notice how, in this example, Data::Printer showed you:

DateTime objects as formatted strings with the timezone (in this case, 'floating');
Database handles with information regarding connection, current database, amount of active statement handles, last run statement and even extra properties that might influence your program (like AutoCommit);
Digest objects (like Digest::MD5) as formatted hexdumps, including a mention when, like in the example above, the digest is actually that of a reset (empty) element.

There are several different filters available on CPAN and you can make some new ones yourself!

In Short

Data::Printer is a shiny tool for your Perl utility belt that not only pretty-prints variables, but also provides very useful information regarding your data. It is also extremely easy to tweak to suit your own taste and debugging needs, from colors to formatting to new filters.

If you're not using it already, give it a go!

Reindeer Games

2012-12-05T00:00:00Z

Dancer is a Perl micro-web framework originally based on Ruby's Sinatra. Its status as a micro-framework means, amongst other things, that it is a very nimble creature, and can thus be taught many nifty tricks.

Have Your Party MC'ed by One of Santa's Close Collaborators

It doesn't take a lot of boilerplate to create a minimalistic web service with Dancer. For example, want to have a web service returning a random song in the current directory to help the party's DJ? There we go:

#!/usr/bin/perl
use strict;
use warnings;

use autodie;

use Dancer;

use List::Util qw/ shuffle /;
use MP3::Info;

get '/song' => sub {
    opendir my $dir, '.';

    my ( $file ) = shuffle grep { /\.mp3$/ } readdir $dir;

    my $song = MP3::Info->new($file);

    return sprintf "%s (%s)", $song->title, $song->artist;
};

dance;

The result is bare-bone, but oh-so-wonderfully functional:

    $ perl next_song.pl
    >> Dancer 1.311 server 6046 listening on http://0.0.0.0:3000
    == Entering the development dance floor ...

    # meanwhile, on the DJ side
    $ curl http://localhost:3000/song
    01 A Tap Dancer's Dilemma (DIABLO SWING ORCHESTRA)

Shorter!

Can we make this shorter? You bet we can. For example, to be able to create mini-web services that only answer to their index route, we can create the module C.pm:

package C;

use Dancer;

{ package ::main; use Dancer ':syntax'; }

END {
    get '/' => \&::index if defined &::index;
    dance;
}

1;

And voilà, make room for your new MC:

    $ perl -MC -MMP3::Info -e'sub index{@s=<*.mp3>;$s[rand@s];}'
    >> Dancer 1.311 server 6046 listening on http://0.0.0.0:3000
    == Entering the development dance floor ...

    # meanwhile, on the DJ side
    $ curl http://localhost:3000/
    01 - 01 A Tap Dancer's Dilemma.mp3

Even shorter!

Can we make this even shorter? Maybe slurp on simple scripts present in sub-directories and convert them into web service routes? My, but of course:

package C;

use 5.10.0;

use Dancer;
use Path::Class qw/ dir /;

{ package ::main; use Dancer ':syntax'; }

dir('.')->traverse( sub{
    my( $child, $cont ) = @_;

    return $cont->() if -d $child;

    my( $route, $method ) = $child =~ /^(.*)\.(get|post|put)$/ or return;

    $route =~ s/^\.// or $route =~ s#^#/#;
    $route =~ s/SPLAT/*/g;

    say "adding route '$route'";

    eval <<"END_ROUTE";
$method '$route' => sub {
    \@_ = \@ARGV = splat;
    local \*STDOUT;
    open STDOUT, '>', \\my \$output;

    { @{[ $child->slurp ]} };

    return \$output;
};
END_ROUTE

    die "couldn't compile route '$child': $@" if $@;
});

END { dance; }

1;

And with that MC on steroids, we can now have auto-web-serviceable perl scripts:

    $ cat song.get
    use List::Util qw/ shuffle /;
    use MP3::Info;

    opendir my $dir, '.';
    my ( $file ) = shuffle grep { /\.mp3$/ } readdir $dir;
    my $song = MP3::Info->new($file);

    printf "%s (%s)", $song->title, $song->artist;

    $ cat request/SPLAT.put
    use 5.10.0;

    open my $request_fh, '>>', 'requests';

    my $song = shift;

    say {$request_fh} $song;

    say "song '$song' added to the list";

which can be used from the command-line:

    $ perl song.get
    01 A Tap Dancer's Dilemma (DIABLO SWING ORCHESTRA)

    $ perl request/SPLAT.put "Rudoplh The Red-Nosed Reindeer"
    song 'Rudoplh The Red-Nosed Reindeer' added to the list

and, once the MC is roused,

    $ perl -MC -e1
    adding route '/request/*'
    adding route '/song'
    >> Dancer 1.311 server 6586 listening on http://0.0.0.0:3000
    == Entering the development dance floor ...

the Holidays are suddenly all Web 2.0-ified:

    $ curl http://localhost:3000/song
    01 A Tap Dancer's Dilemma (DIABLO SWING ORCHESTRA)

    $ curl -X PUT http://localhost:3000/request/Rudolf_the_red_nosed_reindeer
    song 'Rudolf_the_red_nosed_reindeer' added to the list

Which Library Broke?

2012-12-04T00:00:00Z

The CPAN is built on a few key principles that make it such a success. One of them is that you don't have to write every part of the library you want to write. You can declare that it depends on some other preexisting library, and that library will be found and installed. Still, there's sometimes a cultural pressure to require as few prerequisites as possible, and to require as old a version of each as possible. In other words, it's often preferable to avoid making anybody install or upgrade anything. On the other hand, it's also frowned upon to put an upper limit on versions. (Actually, it's technically difficult to specify a bounded range, so those frowns are redundant and maybe I just made them up.)

Keeping track of what versions you actually need, then, often works like this:

1. declare that you need version 0
2. wait for somebody to tell you things aren't working
3. figure out what prerequisite library versions are unusual
4. ask the user to upgrade or downgrade libraries until it works
5. adjust the listed prerequisites

The problem here is #3. It's a pain for the reporting user to go compile his installed versions, and it's a pain to do it yourself for comparison. On the bright side, the CPAN Testers, who can never be sufficiently thanked for their contributions to the CPAN, usually submit reports that include all the data needed here. You can look at almost any random report and find the prerequisites listed, both as they were requested and as they were found. That's great, if you're getting reports from CPAN smoke bots. It doesn't cover everybody else.

For everybody else, the trick is to write a little program that prints out a table like the one in the tester's report: for each prereq, it shows the requested and found version of the module. A program to do that is simple, but sort of a tedious pain to write. As with almost anything useful but boring, there's a Dist::Zilla plugin to do it for you.

Actually, there are several.

Dist::Zilla::Plugin::ReportVersions::Tiny spits out a 000-report-versions-tiny.t file like the one linked to above. One of the best things about that plugin is that it generates a test program with no extra prerequisites. It produces a simple, useful program that you can bundle along with your distribution, and whose output will be included, by default, with any test run, and there's no "cost" in more prereqs for the installing user.

Dist::Zilla::Plugin::Test::ReportPrereqs does much the same job, but lets you tweak the list a bit to add other libraries you know you care about that aren't otherwise in your module list — but it doesn't include the requested version in its output, and only runs if the AUTOMATED_TESTING environment variable is set.

Even if the reporting user isn't sending a "real" CPAN Testers report, it's easy to get the table you need. Just ask the user to run:

  $ perl -Mblib t/000-report-versions-tiny.t | nopaste

(Everybody has nopaste installed, right?)
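If you're curious what such a report program boils down to, here is a rough, core-only sketch. It is not what either plugin generates. The prerequisite list is hypothetical, and so is the report_versions helper.

```perl
use strict;
use warnings;

# Hypothetical prerequisite list: module name => minimum version requested.
my %prereqs = (
    'strict'                  => 0,
    'List::Util'              => '1.00',
    'No::Such::Module::AtAll' => '0.01',
);

# Try to load each module and collect rows of [ name, wanted, found ].
sub report_versions {
    my (%wanted) = @_;
    my @rows;
    for my $module ( sort keys %wanted ) {
        my $found = 'missing';
        ( my $file = "$module.pm" ) =~ s{::}{/}g;
        if ( eval { require $file; 1 } ) {
            my $version = eval { $module->VERSION };
            $found = defined $version ? $version : 'undef';
        }
        push @rows, [ $module, $wanted{$module}, $found ];
    }
    return @rows;
}

printf "%-26s %8s %8s\n", 'Module', 'Want', 'Have';
printf "%-26s %8s %8s\n", @$_ for report_versions(%prereqs);
```

The versions it prints depend entirely on what is installed, which is the point: it's the table you'd otherwise have to assemble by hand.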

Sleigh Upgrade

2012-12-03T00:00:00Z

Santa's little helpers had just completed an upgrade to his fleet of sleighs. Rudolph was getting tired of lighting the way each night, and wanted a year off, so they'd finally fitted some headlights. They'd also updated their code to turn them on and off again automatically (so the batteries didn't wear out) whenever Santa took off:

sub fly_to_next_house {
  my $self = shift;

  $self->sleigh->lights(1);

  $self->gps->set_destination( shift(@{ $self->nice_list })->address );
  $self->sleigh->fly_to_destination( $self->gps );

  $self->sleigh->lights(0);
}

"I like it," said Santa, "but what happens though if the GPS doesn't know where the address is?"

"Hmmm", said the wise old elf, "well, it is running Apple Maps so I guess there might be a problem". "But", he continued, "not to worry. The set_destination method throws an exception and there's some terribly complicated code that catches it and deals with it in the routine that calls fly_to_next_house."

"Ha! The elves just look it up on Google Maps you mean.", Santa laughed, "Though that's not what I'm on about. Look: If the GPS throws an exception, can't you see the lights never get turned off because the code to do so won't be executed?"

"Oh crumbs", the elf conceded, "well, I guess we could use a localized variable to set the lights. Those are automatically unset at the end of the current scope no matter what - even if you do exit by an exception!"

sub fly_to_next_house {
  my $self = shift;

  local $Sliegh::Lights = 1;

  $self->gps->set_destination( shift(@{ $self->nice_list })->address );
  $self->sleigh->fly_to_destination( $self->gps );
}

Santa stroked his beard for a few minutes. Then he shook his head. "No, that's not going to work. For starters you've mistyped 'Sleigh,' and since there's no error with misspelled fully qualified variables, the sleigh will end up flying me in the dark! Even if you fix that, this isn't going to work with one single variable controlling all the lights on every one of my sleighs."

"Oh, good point. How about this then?"

sub fly_to_next_house {
  my $self = shift;

  $self->sleigh->turn_lights_on_till_end_of_scope;

  $self->gps->set_destination( shift(@{ $self->nice_list })->address );
  $self->sleigh->fly_to_destination( $self->gps );
}

"That's great! How did you do that?"

"Well, that's why they call me the Wise Old Elf"

package Sleigh;
...

use Scope::Upper qw(reap UP);

sub turn_lights_on_till_end_of_scope {
  my $self = shift;

  # turn on the lights
  $self->lights(1);

  # and turn them off again when we exit our caller's scope
  reap { $self->lights(0) } UP;
}
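Scope::Upper isn't a core module. For comparison, here is a rough core-only sketch of a related but different technique: a guard object whose destructor switches the lights off. Unlike the elf's reap ... UP, the caller has to hold the guard in a lexical of its own. The Guard class, the lights_guard method and the stand-in Sleigh class are all made up for this demo.

```perl
use v5.14;
use warnings;

# A tiny guard: runs its callback when the object goes out of scope.
package Guard {
    sub new { my ( $class, $cb ) = @_; return bless { cb => $cb }, $class }
    sub DESTROY { local $@; $_[0]{cb}->() }
}

# Stand-in for the real Sleigh class, just enough for the demo.
package Sleigh {
    sub new { return bless { lights => 0 }, shift }

    sub lights {
        my $self = shift;
        $self->{lights} = shift if @_;
        return $self->{lights};
    }

    # Hypothetical method: lights on now, off when the guard is destroyed.
    sub lights_guard {
        my $self = shift;
        $self->lights(1);
        return Guard->new( sub { $self->lights(0) } );
    }
}

package main;

my $sleigh = Sleigh->new;
my $lights_during_flight;

sub fly_to_next_house {
    my $guard = $sleigh->lights_guard;      # lights stay on while $guard lives
    $lights_during_flight = $sleigh->lights;
    die "GPS cannot find the address\n";    # simulate set_destination failing
}

eval { fly_to_next_house() };

say "lights during flight: $lights_during_flight";
say "lights after the exception: ", $sleigh->lights;
```

The exception unwinds fly_to_next_house, $guard goes out of scope, and its destructor turns the lights off.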

My Favorite Pies

2012-12-02T00:00:00Z

I like pie.

I prefer pie to cake, and within the realm of pies, I have a few favorites. Almost certainly, my favorite pie is pumpkin pie. When I learned that it's primarily an American dessert, and had a few Brits tell me that making something sweet from pumpkin sounded awful… well, I was pretty broken up about those poor lost souls.

Pumpkin pie isn't much of a Christmas treat, though. At Christmas, I might be more likely to get a slice of chess pie. Chess pie is even more American, and mostly found in the South. It's pretty much eggs, sugar, more sugar, and vinegar. Some people call it "vinegar pie." Trust me, it's better than it sounds.

Chess pie is good stuff, but I'm sort of expected to write something about Perl today, so I'm going to write about Perl pie. Perl pies are a great treat. They're good for you, they're easy to make, and they require very little Perl expertise to make.

I don't want to put Perl in my mouth.

I don't either! Also, no baking is going to be required, and we're certainly not going to make anything in a microwave.

Okay, then, carry on.

Perl's command line switches are pretty darn cool. Last year, I wrote about the -M switch and some tricks you could pull with it. There are lots of poorly-known switches that can be put to great use, in there. I'd love to cover them all, but for now I'm going to start with -n.

Let's imagine we've got some input file, file.txt:

Alfa
Bravo
Charlie
Delta
Echo

The -n switch implicitly wraps our program in a loop like this:

LINE: while (<>) {
  # your program goes here
}

This is great for doing things you might otherwise do with awk or sed. I haven't used either of those in years, because of perl. For example, we could write this:

#!/usr/bin/perl -n
die "bogus first character" unless /\A[A-Z]/;
s/\A(.)\K/ is the abbreviation for $1/;
print;

...to get...

A is the abbreviation for Alfa
B is the abbreviation for Bravo
C is the abbreviation for Charlie
D is the abbreviation for Delta
E is the abbreviation for Echo

In fact, in my experience almost all programs I'd write with -n end with print, so I never use -n. Instead, I use -p, which is exactly the same but adds:

continue {
  print or die "-p destination: $!\n";
}

The general idea is that now your program is a set of transformations on repeated input, and that you're just editing the stream as it goes by, line by line. It's quite sed-y.

The -n and -p switches are both usable on the shebang line, but they're rarely seen there — it's pretty easy to type the loop out when you're making a program that you're going to keep around a while. They're much more commonly seen in one-liners with the famous and beloved -e (or its younger brother -E). Does your system lack nl for numbering lines? No problem:

~$ perl -pe 'printf "%6u: ", $.' file.txt
     1: Alfa
     2: Bravo
     3: Charlie
     4: Delta
     5: Echo

(Remember $.? It's (mostly) the current line number of the file you're reading.)

Somebody deleted grep? And ack? Well, it sounds like you've got some personnel problems to deal with, but in the meantime, okay:

~$ perl -ne 'print "$.: $_" if /l/' file.txt
1: Alfa
3: Charlie
4: Delta

Note that while we could have used -n to write the first example, printing each line ourselves after the printf, we had to use -n in the second example! Because -p puts its print in a continue block, you can't avoid printing by using next. For that, we must stick to -n.
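If the bit about continue seems odd, here is a small standalone sketch of the loop shape that -p builds. Instead of printing, the continue block collects lines in an array so the effect is easy to see. The sample lines mirror file.txt.

```perl
use strict;
use warnings;

my @input = ( "Alfa\n", "Bravo\n", "Charlie\n", "Delta\n", "Echo\n" );
my @printed;

# Roughly the shape of the -p loop: the "print" lives in the continue
# block (here we collect the lines rather than printing them).
LINE: while ( defined( my $line = shift @input ) ) {
    next LINE unless $line =~ /l/;    # try to skip lines without an "l"
}
continue {
    push @printed, $line;             # runs anyway, even after next
}

print scalar(@printed), " lines would have been printed\n";
```

Every line ends up in @printed, including the ones we tried to skip with next.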

I was told there would be pie.

Yes, well… from -p and -n and -e, we can make a Perl pen, but not a pie. For pie, we're obviously going to need some -i.

The -i switch will be familiar to sed-loving grognards. It lets us edit files on disk, using any value given to the switch as a backup file extension. So:

~$ cat file.txt
Alfa
Bravo
Charlie
Delta
Echo
~$ perl -p -i.bak -e 's/[a-z]/-/g' file.txt
~$ cat file.txt
A---
B----
C------
D----
E---
~$ cat file.txt.bak
Alfa
Bravo
Charlie
Delta
Echo

Now, using an argument to -i is a very good idea. Perl's handling of I/O errors when dealing with files with -i isn't the best, and you can lose data if you (or your operating system) screw up. That said… I don't think I ever actually use .bak or anything like that. That's what git is for, right? In my use, the most important reasons to know about that .bak option are (1) to inform other users so that I have plausible deniability when they ruin their unrecoverable data and (2) to remember that you cannot write perl -pie. That's why Perl pies look like this:

$ perl -pi -e 's/../.../' input.txt
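The same in-place machinery is available inside a regular program through the $^I variable, which is what -i sets. Here is a minimal sketch that redoes the earlier -i.bak example on a scratch file. The dash_out_lowercase helper is made up for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: the in-script equivalent of
#   perl -p -i.bak -e 's/[a-z]/-/g' FILE
sub dash_out_lowercase {
    my ($path) = @_;
    local ( $^I, @ARGV ) = ( '.bak', $path );    # $^I is what -i sets
    while ( my $line = <> ) {
        $line =~ s/[a-z]/-/g;
        print $line;    # goes to the file being rewritten, not the terminal
    }
    return;
}

# Build a scratch copy of file.txt in a temporary directory.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/file.txt";
open my $out, '>', $path or die "cannot write $path: $!";
print {$out} map { "$_\n" } qw(Alfa Bravo Charlie Delta Echo);
close $out or die "cannot close $path: $!";

dash_out_lowercase($path);

# Read the edited file back and show it.
open my $in, '<', $path or die "cannot read $path: $!";
my @edited = <$in>;
close $in;

print @edited;
```

The original contents survive in file.txt.bak next to the edited file, just as with the command-line switch.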

Now bake me a pie!

I use Perl pies quite often, especially for doing mechanical refactoring of code. For example, let's say that I've done a bunch of work on making a library called Pumpkin::Walnut, and it's got a number of associated subclasses, and there's Pumpkin::WalnutX, etc. It turns out that for legal reasons, we can't call it Walnut and have to rebrand the whole thing as Pumpkin::Filbert. First we do a bit of renaming of the files in lib, possibly using rename, and then muck about in the files themselves.

This is a piece of cake (so to speak):

  $ perl -pi -e 's/Pumpkin::Walnut/Pumpkin::Filbert/g' $(find lib -type f)

...then review for absurdity by consulting git diff.

Adding editor hints to your files is trivial:

  $ perl -pi -e 'print "%# vim: ft=mason:\n" unless $did{$ARGV}++' $(find mason -type f)

You can fix wonky newlines:

  $ perl -pi -e 's/\x0D\x0A?/\n/g' file.txt

...and of course you can do all sorts of things other than s///. Here's a longer-form of a one-liner I keep lying around:

~$ cat numbers.csv
5,7,7,9,14,13,9,3,0,6
18,6,17,15,5,19,2,0,16,12
5,3,5,5,9,13,19,13,4,17
16,16,14,1,10,2,10,2,11,9
15,1,14,14,18,12,4,10,16,16

~$ perl -MList::Util=sum -ani -F, -E 'say sum @F' numbers.csv
~$ cat numbers.csv
73
110
93
91
120
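In case the -a and -F switches are new to you: -a splits each input line into @F, and -F, says to split on commas. Spelled out as ordinary code, the per-line work looks roughly like this. The sum_csv_line name is made up, and only the first two lines of numbers.csv are used.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# What -a and -F, arrange for each line: split it on commas into fields.
# Here the same step is an ordinary function (the name is made up).
sub sum_csv_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line;
    return sum(@fields);
}

my @totals = map { sum_csv_line($_) } (
    "5,7,7,9,14,13,9,3,0,6\n",
    "18,6,17,15,5,19,2,0,16,12\n",
);

print "$_\n" for @totals;
```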

It's a lot of fun to write big applications in Perl, using all the other libraries we talk about every other day on the Perl Advent Calendar, but sticking to plain old core Perl is still a pretty sweet way to solve tons of everyday problems.

Sweet Path::Class is Coming to Town

2012-12-01T00:00:00Z

File and directory paths. They start off so simple, but then depending on which system the script runs on, they might use slashes or backslashes. And then there are spaces that come to mess everything up. And then there are those systems that use drive letters and volume names. And so on, and so forth.

You know what? Give yourself the gift of simplicity these holidays, and begin to use Path::Class.

Path::Class wraps all those file and directory operations in object-oriented goodness that will warm your DWIM-hungry little heart.

To begin with, creating the objects is pretty straight-forward:

use Path::Class qw/ file dir /;

my $dir = dir(qw/ home santa children naughty / );

# could also have done
# my $dir = dir( '/home/santa/children/naughty' );
# and Path::Class would have understood

my $entry = file(qw/ home santa children naughty yanick /);

# or simply
#my $entry = $dir->file('yanick');

And because the dir and file objects auto-stringify to their representing strings, it means that you can use them just like any regular path strings:

say "Uh oh" if -f $entry;

opendir my $dh, $dir;
printf "there are %d naughty children in this directory\n",
    scalar grep { !/^\.\.?$/ } readdir $dh;    # skip . and ..

But as soon as you discover the methods those objects have, you'll soooo not want to do that. Traveling up and down directory structures will now be a joy:

# down
my $subdir = $dir->subdir('really_naughty');

# up and down
my $good_ones = $dir->parent->subdir('nice');

say "by the by, it's a relative path" if $dir->is_relative;

Cleaning up complicated paths? A wonder:

say dir( '/home/santa/../santa////children/.//nice' )->resolve;
# yes, prints '/home/santa/children/nice'

Utilities and shortcuts to create tempfiles, iterate through the directory entries or traverse the directory structure? All there:

# create a temporary file
my ( $wishlist_fh, $wishlist_filename ) = $dir->tempfile;

say { $wishlist_fh } "I want a $_" for @gift_ideas;

# read the directory
while ( my $naughty_child = $naughty_dir->next ) {
    $naughty_child->remove if $naughty_child =~ /yanick|rjbs/;
}

# go gift hunting
my @hidden = dir(qw/ dev rooms /)->traverse(sub {
    my( $entry, $cont ) = @_;
    # grab any gift file
    return $cont->(), ($entry) x ( -f $entry and $entry =~ /\.gift$/ );
});

Regular file operations are likewise simplified via nifty methods:

my $list = file(qw/ home yanick xmas list /);

# read the whole file, split it in lines
my @wishlist = $list->slurp;

s/$/ pretty please/ for @wishlist;  # doesn't hurt to be polite

# and write back to the file (spew takes the content as one argument)
$list->spew( join '', @wishlist );

# if one wants a filehandle...
my $fh = $list->openr;  # open for reading
while(my $wish = <$fh>){
    say "I want a $wish";
}

# want to touch a file?
file('/home/santa/helpers/glug')->touch;

# or remove one?
file('/home/santa/helpers/Belsnickel')->remove;

Trust me, once you begin using them, there'll be no going back. In fact, you'll probably wish they were also available as Moose types to use them everywhere natively.

And then you'll discover MooseX::Types::Path::Class, and the Christmas bells will be forevermore ringing.
