Perl Advent Calendar 2006-12-05

Wacket auf!

by Shlomi Fish

First off, a confession: I spent so much time today trying to integrate the subject of this feature with Vim that I decided to write about it here. I was initially unsuccessful due to a bug present in my relatively old version, since fixed on CPAN. That's what you get for being an "early adopter".

Andy 'petdance' Lester recently released the useful command line utility ack. It is similar to the venerable grep, and is primarily intended for scanning trees of code, especially trees that contain code in many different languages. ack command lines tend to be much shorter than the equivalent grep -r or find command lines. ack's only non-core requirement is Andy's File::Next, which is similar to File::Find::Object.

Let's start... searching!

Our corpus for this example is the parrot source tree, which is about as heterogeneous a source tree as you can get. For the terminally curious: we grabbed a copy from the Subversion repository, and then installed ack from CPAN in the customary manner, like so:

$ svn co -r 15920 http://svn.perl.org/parrot/trunk
#Command output omitted.
$ perl -MCPAN -e "install App::Ack"
#Command output omitted.

Now we can cd into the parrot trunk and start playing with ack. Note that it marks up its output with color on ANSI-capable terminals.

$ ack 'fprintf.*Bad expr' .
languages/m4/src/eval.c
291:      fprintf( stderr, "Bad expression in eval (missing right parenthesis): %s",
296:      fprintf( stderr, "Bad expression in eval: %s", expr);
300:      fprintf( stderr, "Bad expression in eval (bad input): %s", expr);
304:      fprintf( stderr, "Bad expression in eval (excess input): %s", expr);

ack accepts a Perl regular expression as its first argument. The other arguments are path names. We specified "." (the current directory) and, as we can see, ack proceeded to recurse into the tree, which it does by default. The first thing you probably notice is the user-friendly output format. However, ack will produce a more machine-readable, colorless, grep-like format (${filename}:${line_num}:${line}) if its output is piped to another program:

$ ack 'fprintf.*Bad expr' . | cat
languages/m4/src/eval.c:291:      fprintf( stderr, "Bad expression in eval (missing right parenthesis): %s",
languages/m4/src/eval.c:296:      fprintf( stderr, "Bad expression in eval: %s", expr);
languages/m4/src/eval.c:300:      fprintf( stderr, "Bad expression in eval (bad input): %s", expr);
languages/m4/src/eval.c:304:      fprintf( stderr, "Bad expression in eval (excess input): %s", expr);
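
For the curious, here is a minimal shell sketch of how a tool can make this kind of decision. It is not ack's actual code (ack is written in Perl); emit_match is a hypothetical helper made up for illustration, and the color choice is an assumption. The only real mechanism on display is the standard [ -t 1 ] test, which succeeds when standard output is a terminal.

```shell
# Hypothetical helper, not part of ack: print one match in a format
# chosen by where standard output is going.
# Usage: emit_match FILE LINE_NUMBER TEXT
emit_match() {
    if [ -t 1 ]; then
        # stdout is a terminal: file name as a colored heading (ANSI bold
        # green, an arbitrary choice), then the numbered line below it.
        printf '\033[1;32m%s\033[0m\n%s:%s\n' "$1" "$2" "$3"
    else
        # stdout is a pipe or a file: plain filename:line:text, like grep -n.
        printf '%s:%s:%s\n' "$1" "$2" "$3"
    fi
}
```

Running emit_match eval.c 296 'some text' directly in a terminal should give the colored form, while appending | cat should give the plain one.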

This grep-like format lets ack serve as a drop-in replacement for grep in text editors and other tools. Something else you might notice is that, although we ran ack on a Subversion working copy, it didn't display results from the copies of the files inside the .svn directories. This is more than we can say for a simple grep -r:

$ grep -rn 'fprintf.*Bad expr' .
./languages/m4/src/.svn/text-base/eval.c.svn-base:291:      fprintf( stderr, "Bad expression in eval (missing right parenthesis): %s",
./languages/m4/src/.svn/text-base/eval.c.svn-base:296:      fprintf( stderr, "Bad expression in eval: %s", expr);
./languages/m4/src/.svn/text-base/eval.c.svn-base:300:      fprintf( stderr, "Bad expression in eval (bad input): %s", expr);
./languages/m4/src/.svn/text-base/eval.c.svn-base:304:      fprintf( stderr, "Bad expression in eval (excess input): %s", expr);
./languages/m4/src/eval.c:291:      fprintf( stderr, "Bad expression in eval (missing right parenthesis): %s",
./languages/m4/src/eval.c:296:      fprintf( stderr, "Bad expression in eval: %s", expr);
./languages/m4/src/eval.c:300:      fprintf( stderr, "Bad expression in eval (bad input): %s", expr);
./languages/m4/src/eval.c:304:      fprintf( stderr, "Bad expression in eval (excess input): %s", expr);

In typical DWIM fashion, ack does not descend into directories such as .svn, blib, and CVS, because their contents would obscure the useful results.
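
To get a feel for what ack is saving us from, here is a rough sketch of the find and grep incantation needed to skip those three directories by hand. The function name search_tree is hypothetical, the pattern is a grep-style basic regular expression rather than a Perl one, and ack's own list of ignored directories may well be longer than the three named above.

```shell
# Hypothetical helper, not part of ack: search a tree with grep while
# skipping version-control and build directories.
# Usage: search_tree PATTERN [DIRECTORY]
search_tree() {
    pattern=$1
    dir=${2:-.}
    # -prune stops find from descending into the named directories.
    # Every other regular file is handed to grep in batches; the extra
    # /dev/null argument forces grep to print file names even when a
    # batch happens to contain a single file.
    find "$dir" \( -name .svn -o -name CVS -o -name blib \) -prune \
        -o -type f -exec grep -n -e "$pattern" /dev/null {} +
}
```

With this in place, search_tree 'fprintf.*Bad expr' . should report only the working files and not their .svn twins, which is a lot of typing for something ack does by default.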

Irregular Expressions

As you're probably aware, there are many regular expression dialects in common use. Since ack is written in Perl, it makes the full power of Perl regular expressions available to you. So for example we can say:

$ ack '\b\$pattern\s*' .

This is much more convenient and less confusing than the myriad flavors of grep out there, including those with PCRE syntax, which is not fully compatible with Perl's regular expressions. As an aside, note that PCRE support is not even available in all modern builds of GNU grep.

Filetype Identification

ack has options to search specific file types. For example we can say:

$ ack --perl fprintf .
tools/build/pbc2c.pl
266:            fprintf(stderr, "\t" INTVAL_FMT ": %s\n", i, argv[i]);

tools/dev/lib_deps.pl
410:fprintf     stdio.h
619:vfprintf    stdio.h

This looks for occurrences of "fprintf" in Perl files. Note that --perl searches files with several popular Perl extensions, as well as files whose shebang lines point to perl. We can even specify more than one file type, or exclude a type with its longopt --no$lang counterpart.
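
The extension-or-shebang idea is easy to sketch in shell. What follows is an illustration of the technique only, not ack's real logic: is_perl_file is a hypothetical name, and the extension list is an assumption chosen for the example rather than the list ack actually uses.

```shell
# Hypothetical helper, not ack's actual code: guess whether a file is Perl.
# The extension list below is an illustrative assumption.
# Usage: is_perl_file PATH   (exit status 0 means "looks like Perl")
is_perl_file() {
    # First try the file name.
    case $1 in
        *.pl|*.pm|*.t|*.pod) return 0 ;;
    esac
    # No known extension: fall back to the shebang line, if we can read it.
    [ -f "$1" ] && [ -r "$1" ] || return 1
    first_line=$(head -n 1 "$1")
    case $first_line in
        '#!'*perl*) return 0 ;;
    esac
    return 1
}
```

A script named plain old "frobnicate" that starts with #!/usr/bin/perl would pass this check, which is exactly the kind of file an extension-only filter would miss.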

Integration with Editors

Integrating ack with Vim

A naive way to see ack results in Vim's quickfix buffer would be to use the cexpr command:

:cexpr system('ack --perl map .')

An alternative would be to set the :grep command to use ack instead of grep:

:set grepprg=ack

Then one can write :grep [ack arguments] to search using ack. However, this prevents the use of grep itself via :grep, which may or may not be an issue for you. A better solution can be had by adding the following lines to one's .vimrc file:

function! Ack_Search(command)
    cexpr system("ack " . a:command)
endfunction

command! -nargs=+ -complete=file Ack call Ack_Search(<q-args>)

This snippet defines a new Ex command called ":Ack" that searches using ack and displays the results in the quickfix buffer. One might then issue:

:Ack --perl map .

Integrating ack with Emacs

To integrate ack with XEmacs, include this code in ~/.xemacs/custom.el. It will provide access via "M-x ack".

This code was only tested in XEmacs and may require some adaptation for GNU Emacs.

And then?

ack has many other nifty features, and like any piece of software still under development it has some bugs too. So grab a copy and grep^Wack away, submitting patches along the way.