2023 twenty-four merry days of Perl

Trimming your holiday tree

builtin - 2023-12-04

It's time to pull out your holiday decorations. If you are like me, you probably have a couple of new decorations to add to your collection.

Perl v5.36 has some new ornaments too. The new pragma builtin defines several new functions in Perl's core. I go through these in Perl New Features, but there's one I want to show you this holiday season.

Remove surrounding whitespace

Typically I get rid of the annoying whitespace with an inefficient global substitution that I can't make my fingers not type:

use v5.14;  # the /u flag needs v5.14 or later
my $dirty_string = '   Happy Holidays   ';
say "Dirty:   <$dirty_string>";

# strip leading and trailing whitespace in place
$dirty_string =~ s/\A\s+|\s+\z//ug;
say "Trimmed: <$dirty_string>";

That /u tells the substitution to use Unicode rules, which means it also trims the other whitespace characters that Unicode defines. I'm especially fond of that one after chasing down a bug caused by someone using an en space (U+2002) instead of a regular space (likely from an export from layout software).
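To see that in action, here is a minimal sketch with an en space standing in for the layout-software export. The sample string is made up for illustration:

```perl
use v5.14;

# Made-up input: an en space (U+2002) in front, an en space and a tab behind
my $dirty_string = "\x{2002}Happy Holidays\x{2002}\t";

# Copy first, then trim the copy in place
(my $trimmed = $dirty_string) =~ s/\A\s+|\s+\z//ug;

say "Trimmed: <$trimmed>";
say 'Removed ', length($dirty_string) - length($trimmed), ' characters';  # Removed 3 characters
```

The en space goes away along with the tab because \s matches every character that Unicode flags as whitespace.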

Or, more likely, I keep the original as is and make a copy with the /r flag from v5.14. This way I can log the original value later when I'm trying to figure out why some inputs don't work right (or someone is passing bad data):

use v5.26;  # indented heredocs (<<~) need v5.26
my $dirty_string = '   Happy Holidays   ';
my $trimmed = $dirty_string =~ s/\A\s+|\s+\z//ugr;  # /r returns the trimmed copy

say <<~"HERE";
    Dirty:   <$dirty_string>
    Trimmed: <$trimmed>
    HERE

This is such a common operation that it deserves its own name: trim. The builtin pragma is still experimental, so I turn off those warnings with experimental before I import trim:

use v5.36;
use experimental qw(builtin);
use builtin qw(trim);

my $dirty_string = '   Happy Holidays   ';
my $trimmed = trim($dirty_string);  # returns a trimmed copy; the original is untouched

say <<~"HERE";
    Dirty:   <$dirty_string>
    Trimmed: <$trimmed>
    HERE

All of these get you to the same output:

        Dirty:   <   Happy Holidays   >
        Trimmed: <Happy Holidays>

More ornaments

builtin provides other new ornaments. I give detailed examples of these in Perl New Features.

The blessed, refaddr, reftype, is_weak (as isweak), weaken, and unweaken functions have so far been part of the Scalar::Util module, as has is_tainted (as tainted), which joined builtin in v5.38. Now they are directly in core and do the same job.
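Here is a short sketch of the reference-inspecting functions working without Scalar::Util. The Ornament class name and its data are invented for the example:

```perl
use v5.36;
use experimental qw(builtin);
use builtin qw(blessed reftype refaddr weaken unweaken is_weak);

# Ornament is a made-up class name for this sketch
my $ornament = bless { color => 'red' }, 'Ornament';

say blessed($ornament);    # Ornament
say reftype($ornament);    # HASH

# A second reference to the same hash; weaken it so it no longer
# counts toward keeping the object alive
my $also = $ornament;
weaken($also);
say is_weak($also) ? 'weak' : 'strong';                            # weak
say refaddr($also) == refaddr($ornament) ? 'same' : 'different';   # same

# Turn it back into an ordinary (strong) reference
unweaken($also);
say is_weak($also) ? 'weak' : 'strong';                            # strong
```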

The ceil and floor functions move from the heavyweight POSIX module directly into core.
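A quick sketch of the two rounding functions without loading POSIX. The numbers are made up:

```perl
use v5.36;
use experimental qw(builtin);
use builtin qw(ceil floor);

my ($ornaments, $per_box) = (25, 6);   # made-up numbers

say 'Boxes needed: ', ceil($ornaments / $per_box);    # 5
say 'Full boxes:   ', floor($ornaments / $per_box);   # 4

# ceil rounds up and floor rounds down, even for negative numbers
say ceil(-1.5);    # -1
say floor(-1.5);   # -2
```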

More excitingly, builtin adds true and false: no more !!1 floating around your code confusing everyone. These "distinguished" booleans know that they are booleans, rather than being ordinary values that merely pass Perl's test for truthiness. These are useful for turning data into formats that have special keywords for those ideas, such as JSON's true or false. You can check if you have one of these distinguished values with is_bool.
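As a small sketch of the difference, is_bool separates real booleans from values that are merely truthy. The describe helper and the variable names are invented for this example:

```perl
use v5.36;
use experimental qw(builtin);
use builtin qw(true false is_bool);

# describe is a hypothetical helper for this sketch
sub describe ($value) {
    return is_bool($value) ? 'boolean' : 'plain value';
}

my $wrapped   = true;
my $delivered = false;
my $count     = 1;      # truthy, but not a boolean

say describe($wrapped);     # boolean
say describe($delivered);   # boolean
say describe($count);       # plain value
say describe(1 == 1);       # boolean: comparisons return booleans too
```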

Along with that, builtin can tell whether a value started life as a string or a number, no matter how you have used it since. With created_as_number or created_as_string, you know if you need to create the JSON number value {"value": 137} or the JSON string value {"value": "137"}. This should work going the other way too. Eventually the various libraries will catch up to Perl so we don't have to do weird acrobatics to figure out what sort of value we are supposed to have.
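To make the JSON case concrete, here is a rough sketch of choosing the output form from how a value was created. The json_scalar helper is hypothetical and not part of any JSON module, and it skips string escaping, so treat it as an illustration only:

```perl
use v5.36;
use experimental qw(builtin);
use builtin qw(true is_bool created_as_number created_as_string);

# json_scalar is a hypothetical helper, not part of any JSON module.
# It does no string escaping, so it only suits simple values.
sub json_scalar ($value) {
    return $value ? 'true' : 'false' if is_bool($value);
    return "$value"                  if created_as_number($value);
    return qq("$value");
}

my $number = 137;
my $string = "137";

# Numeric use does not change how $string was created
my $sum = $string + 1;
say created_as_string($string) ? 'still a string' : 'now a number';   # still a string

say json_scalar($number);   # 137
say json_scalar($string);   # "137"
say json_scalar(true);      # true
```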

There are a few other interesting features for you to discover in builtin, and I hope many of these lose their experimental status soon.

This article contributed by: brian d foy <bdfoy@cpan.org>