
About the Author Jan Dubois is the tech lead for all Perl technologies at ActiveState. He is a member of the core Perl5-Porters group that maintains the Perl language implementation. Jan wrote the Win32::OLE Automation support for Perl while he was still working in the financial industry. He has a Masters degree in Physics from the University of Hamburg, Germany. Jan now lives in Vancouver in Beautiful British Columbia, home of the 2010 Winter Olympics. |
“Curiouser and curiouser” cried Alice as she descended down the rabbit hole. In Event 9 of the 2008 Winter Scripting Games we’ll see if we can make this journey even more extraordinarily wonderful by looking at the world backwards.
The challenge
We need to read a single line from a text file, printing it out to the console, but reversing the characters in each word during the process.
How we are going to solve it
This problem is so simple that we can solve it with a single expression, directly on the command line:
C:>perl -pe "s/(\S+)/reverse $1/ge" c:\scripts\alice.txt
The –p option tells Perl to implicitly put a loop around the program like this:
while (<>) { ...; print }
This loops through the data file (specified as a command-line argument) line-by-line, performing whatever processing is requested by the user and printing the result.
The –e option tells Perl that the following argument is a literal script, and not the name of a file containing the script. Our script performs the following operation:
s/(\S+)/reverse $1/ge
The command uses a regexp search-and-replace to find each word (that is, each group of non-whitespace characters), replacing the characters in each word with those same characters, but in reverse order: reverse($1). The /e option tells the s/// operator to call eval() on the replacement expression before inserting it back into the string. After all, we want the replacement text to read “resuoiruC” and not literally “reverse(Curiouser)”. Finally the /g option makes sure this replacement operation is repeated until it no longer applies.
By using reverse() inside the replacement expression we implicitly put it into scalar context. This is important because, in list context, reverse() will not reverse the characters in a string, but the elements in its argument list. Since we are providing only a single argument it would have no effect in list context here.
Spicing it up
To amuse myself (I guess I’m easily amused) I wrote a script to do a more elaborate conversion: Don’t consider punctuation to be part of the words unless it is an apostrophe in the middle of a word (like “I’m”, or “won’t”). In addition, uppercase the first letter of the reversed word (and lowercase the last) if the original word had an uppercase initial. Finally, I wanted the result broken into nice lines instead of wrapping around at the end of the line of the console window:
use strict;
use warnings;
use File::Slurp qw(slurp);
use Text::Wrap qw(wrap);
for (slurp "C:/Scripts/alice.txt") {
my $output;
while (length) {
if (s/^([a-zA-Z]+(\'[a-zA-Z]+)*)//) {
my $word = reverse $1;
$word = ucfirst lc $word if $word =~ /[A-Z]$/;
$output .= $word;
}
else {
s/^([^a-zA-Z]+)//;
$output .= $1;
}
}
print wrap("", "", $output);
}
I’ve been getting a little tired of the “open(my $fh, "<", "…") or die” stuff to read a file, so I installed the File-Slurp package with PPM and used the slurp() function to read the file:
use File::Slurp qw(slurp);
I always specify the function names explicitly in all “use” statements instead of relying on the default symbol import mechanism. This makes maintenance much easier if I ever revise the script because I know I can stop loading the module when I’m no longer calling any of the functions I’m explicitly importing.
while (length) {
if (s/^([a-zA-Z]+(\'[a-zA-Z]+)*)//) {
my $word = reverse $1;
$word = ucfirst lc $word if $word =~ /[A-Z]$/;
$output .= $word;
}
else {
s/^([^a-zA-Z]+)//;
$output .= $1;
}
}
print wrap("", "", $output);
While there is still text in $_ we either chop off the first word, or the first block of non-word characters. Note how a word may contain an apostrophe as long as there is another letter immediately behind it. Both s/// expressions always match at the beginning of the string, and they remove whatever they match by replacing it with an empty string. That way the algorithm walks destructively through the input line until there is nothing left.
If the first s/// matched a word, then the characters of the word are reversed. If the last character is now uppercase, then the new word is transformed to lowercase completely and the first character is uppercased again. The result looks like this:
D:\tmp\ScriptingGames>perl ae9b.pl 'Resuoiruc dna resuoiruc!' deirc Ecila (ehs saw os hcum desirprus, taht rof eht tnemom ehs etiuq togrof woh ot kaeps doog Hsilgne); 'won M'i gninepo tuo ekil eht tsegral epocselet taht reve saw! Doog-eyb, teef!' (rof nehw ehs dekool nwod ta reh teef, yeht demees ot eb tsomla tuo fo thgis, yeht erew gnitteg os raf ffo). 'Ho, ym roop elttil teef, I rednow ohw lliw tup no ruoy seohs dna sgnikcots rof uoy won, sraed? M'i erus _I_ t'nahs eb elba! I llahs eb a taerg laed oot raf ffo ot elbuort flesym tuoba uoy: uoy tsum eganam eht tseb yaw uoy nac; --tub I tsum eb dnik ot meht,' thguoht Ecila, 'ro spahrep yeht t'now klaw eht yaw I tnaw ot og! Tel em ees: Ll'i evig meht a wen riap fo stoob yreve Samtsirhc.'
I think this reads much more wonderful than the output of the one-liner.
More one-liner madness
Patrick LeBoutillier sent me email showing a one-liner implementation for the prime number challenge in Event 6 that he adopted from code he found on the Net. The algorithm goes back to some of the JAPHs created by Abigail a long time ago. JAPHs are traditionally written in obfuscated code and are used to show off your mad Perl skillz. They have to share quite a bit of the blame for Perl code’s reputation of being hard to read.
C:>perl -E "say for grep { (1x$_) !~ /^(11+)\1+$/ } 2..200"
The –E option is very similar to –e: it tells Perl that the next argument on the command line is the script to be executed. But –E also implies “use 5.010”, so we can use all new Perl 5.10 features directly (like the “say” keyword above).
Abigail’s algorithm uses the Perl regexp engine to determine if a number has any natural divisors. Maybe it is easiest to explain how it works by walking through an example:
Let’s assume we want to see if 7 is a prime number. Then the algorithm constructs a string of 7 characters “1111111”. The regular expression then tests if this string matches /^(11)(11)+$/, or /^(111)(111)+/ or /^(1111)(1111)+/ etc. The second group in each pattern comes from the back reference \1.
This means the regexp engine is testing if the original string can be constructed by repeating a shorter string of 2 or more characters multiple times. If this is possible, then the length of the original string cannot be a prime because the length of the shorter string would be a divisor.
The grep() function above just filters all numbers between 2 and 200 that don’t match the pattern (and therefore must be prime numbers) and prints them.
Note that JAPH code is optimized for cleverness and obfuscation, not for efficiency and maintenance. Don’t write code like that at work! If you enjoy trying to figure out obfuscated code, or even just looking at it, then you’ll find an archive of old JAPHs at http://www.cpan.org/misc/japh.
Have you written solutions to other challenges in the 2008 Winter Scripting Games as one-liners? Please let me know at jand@activestate.com.