2008 Winter Scripting Games

Jan Dubois' Solution to Advanced Perl Event 5: You Call That a Strong Password?

Event 1 Solution

About the Author

Jan Dubois is the tech lead for all Perl technologies at ActiveState. He is a member of the core Perl5-Porters group that maintains the Perl language implementation. Jan wrote the Win32::OLE Automation support for Perl while he was still working in the financial industry. He has a Masters degree in Physics from the University of Hamburg, Germany. Jan now lives in Vancouver in Beautiful British Columbia, home of the 2010 Winter Olympics.

*

You Call That a Strong Password?

Welcome back to another day at the 2008 Winter Scripting Games. Word on the street has it that starting with the fifth event the “Committee on the Empowerment of Passwords” is planning random and unannounced checks on the strength of the passwords used by the attendees. Don’t be caught with a weak password!

The challenge

We have to perform a number of checks to see if a given password conforms to a set of rules, awarding a point for each rule that the password passes. The total score is then compared to a ranking scale to determine the relative strength of the password.

How we are going to solve it

use strict;
use warnings;

use Text::Wrap qw(wrap);

my %word;
open(my $fh, "<", "C:/Scripts/wordlist.txt") or die;
while (<$fh>) {
    chomp;
    $word{lc()} = 1;
}
close($fh);

my @rules = (
    [ "Make sure that the password is not an actual word.",
      sub { !exists $word{lc()} },
    ],
    [ "Make sure that the password, minus the last letter, is not
       an actual word.",
      sub { !exists $word{substr(lc,0,-1)} },
    ],
    [ "Make sure that the password, minus the first letter, is not
       an actual word.",
      sub { !exists $word{substr(lc,1)} },
    ],
    [ "Make sure that the password does not simply substitute 0
       (zero) for the letter o (either an uppercase O or a
       lowercase o).",
      sub { !(s/0/O/g && exists $word{lc()}) },
    ],
    [ "Make sure that the password does not simply substitute 1
       (one) for the letter l (either an uppercase L or a
       lowercase l).",
      sub { !(s/1/L/g && exists $word{lc()}) },
    ],
    [ "Make sure that the password is at least 10 characters long
       but no more than 20 characters long.",
      sub { length() >= 10 and length() <= 20 },
    ],
    [ "Make sure that the password includes at least one number
       (the digits 0 through 9).",
      sub { /\d/ },
    ],
    [ "Make sure that the password includes at least one uppercase
       letter.",
      sub { /[A-Z]/ },
    ],
    [ "Make sure that the password includes at least one lowercase
       letter.",
      sub { /[a-z]/ },
    ],
    [ "Make sure that the password includes at least one symbol.",
      sub { /[^a-zA-Z0-9]/ },
    ],
    [ "Make sure that the password does not include four (or more)
       lowercase letters in succession.",
      sub { !/[a-z]{4}/ },
    ],
    [ "Make sure that the password does not include four (or more)
       uppercase letters in succession.",
      sub { !/[A-Z]{4}/ },
    ],
    [ "Make sure that the password does not include any duplicate
       characters.",
      sub { !/(.).*\1/ },
    ],
);

# Remove extra whitespace from multi-line rule names
$_->[0] =~ s/\s+/ /g for @rules;

my @ratings = (
#   [ min score => strength            ]
    [ @rules-2  => "strong"            ],
    [ @rules-6  => "moderately-strong" ],
    [     0     => "weak"              ],
);

########################################################################

my $password = shift;

my $score = 0;
my @failures;
for (@rules) {
    my($name, $test) = @$_;
    local $_ = $password;
    if ($test->()) {
        ++$score;
        next;
    }
    push @failures, $name;
}

if (@failures > 1) {
    print wrap("* ", "  ", $_), "\n\n" for @failures;
}
elsif (@failures) {
    print wrap("", "", $failures[0]), "\n\n";
}

for (@ratings) {
    my($min, $strong) = @$_;
    next if $score < $min;
    print "A password score of $score indicates a $strong password.\n";
    last;
}

As before, I’m only going to highlight any new techniques I haven’t explained in the earlier events:

use strict;
use warnings;

use Text::Wrap qw(wrap);

my %word;
open(my $fh, "<", "C:/Scripts/wordlist.txt") or die;
while (<$fh>) {
    chomp;
    $word{lc()} = $_;
}
close($fh);

First off we read the wordlist from the file and store the lowercase version of each word in a hash for quick lookup.You may be wondering why I wrote $word{lc()}: lc() without argument will return the lowercase version of $_, but why write down the parenthesis? This is necessary here because a bare “lc” inside the braces is interpreted by Perl as a literal string, so $word{lc} is the same as $word{"lc"} and not $word{lc($_)}.

my @rules = (
   [ "Make sure that the password is not an actual word.",
      sub { !exists $word{lc()} },
   ],

We have 13 rules, where each rule has some descriptive text and an implementation that tests if the rule has been met. I decided to store these rules in a data structure to separate them from the logic that tests the rules, keeps track of scores etc. This should make everything easier to read and maintain.

The @rules array contains a list of array references, one for each rule. Each rule in turn is just 2 elements: the rule descriptions, and a function reference of the implementation. The function for each rule must return a true value when the password doesn’t violate the rule and a false value otherwise.

We will set $_ to the password before each rule is called, so we don’t need to deal with argument passing.

    [ "Make sure that the password, minus the last letter, is not
       an actual word.",
       sub { !exists $word{substr(lc,0,-1)} },
    ],
    [ "Make sure that the password, minus the first letter, is  not
       an actual word.",
       sub { !exists $word{substr(lc,1)} },
    ],

The second argument of the substr() function is the starting offset, and the third one is the length. Both offset and length can be negative, in which case they are being counted from the end of the string. Therefore substr(lc,0,-1) is the lowercase substring of $_ from the start to the end, except for the last character.

Similarly, if the length is omitted, the substr() will include everything until the end of the string. So substr(lc,1) is the lowercase substring of $_ from the second character to the end.

[ "Make sure that the password does not simply substitute 0
   (zero) for the letter o (either an uppercase O or a
   lowercase o).",
   sub { !(s/0/O/g && exists $word{lc()}) },
],
[ "Make sure that the password does not simply substitute 1
   (one) for the letter l (either an uppercase L or a
   lowercase l).",
   sub { !(s/1/L/g && exists $word{lc()}) },
],

The two rules above try to replace all “0” or “1” characters by “O” or “L” respectively, and then check if the result is a known word. The /g modifier on the s/// substitution operator means that the operation should be repeated until it no longer applies.

These rules will only fail if both conditions are met: there is at least a single substitution and the result is a known word.

Since these rules are not tested together, they will not flag “t001b0x” because neither “t00lb0x” nor “too1box” are known words.

    [ "Make sure that the password is at least 10 characters long
       but no more than 20 characters long.",
       sub { length() >= 10 and length() <= 20 },
    ],
    [ "Make sure that the password includes at least one number
       (the digits 0 through 9).",
       sub { /\d/ },
    ],
    [ "Make sure that the password includes at least one uppercase
       letter.",
       sub { /[A-Z]/ },
    ],
    [ "Make sure that the password includes at least one lowercase
       letter.",
       sub { /[a-z]/ },
    ],
    [ "Make sure that the password includes at least one symbol.",
       sub { /[^a-zA-Z0-9]/ },
    ],

The contest rules were not very specific about which characters are considered symbols, so the last rule above treats everything except lowercase letters, uppercase letters, and digits as symbols. The caret character “^” at the beginning of the character class [^a-zA-Z0-9] indicates an inverted class, matching all characters except those explicitly listed.

    [ "Make sure that the password does not include four (or more)
       lowercase letters in succession.",
       sub { !/[a-z]{4}/ },
    ],
    [ "Make sure that the password does not include four (or more)
   uppercase letters in succession.",
       sub { !/[A-Z]{4}/ },
    ],
    [ "Make sure that the password does not include any duplicate
       characters.",
       sub { !/(.).*\1/ },
    ],
);

The last rule uses a backwards reference \1 to the earlier capture group. The capture group matches any character: “(.)”. The rule triggers if this character appears again, possibly separated from the first occurrence by any number of other characters.

# Remove extra whitespace from multi-line rule names
$_->[0] =~ s/\s+/ /g for @rules;

You will have noticed that some of the rule descriptions spanned multiple lines, so they will contain the newline character as well as additional spaces from the indentation. This needs to be removed so we have nice strings for printing later on.

my @ratings = (
#   [ min score => strength            ]
    [ @rules-2  => "strong"            ],
    [ @rules-6  => "moderately-strong" ],
    [     0     => "weak"              ],
);

The contest rules had specific numbers for each strength level: scores 11-13 are considered “strong”, scores 7-10 are “moderately-strong”, and everything below is “weak”.

I decided to put the rating rules into another table, so that they, too can easily be changed later, without having to edit the algorithm itself. I also decided that these levels should be defined relative to the number of rules in @rules: You can miss up to 3 rules and still be strong, or miss up to 7 rules and still be considered “moderately-strong”. This way we can add another rule and don’t have to tweak the ratings table. However, if we add lots of additional rules, then we may want to relax the ranking requirements a bit to compensate.

The ratings must be ordered from strongest to weakest, so that we can just walk through the entries and stop as soon as the current strength is larger or equal to the one specified for each level. The minimum strength 0 entry at the end makes sure that we won’t walk past the end of the table because it is impossible to have a score lower than 0.

Note that it will also be trivial to add another level to @rankings if we ever decide to do so:

my @ratings = (
#   [ min score => strength            ]
    [ @rules-2  => "strong"            ],
    [ @rules-6  => "moderately-strong" ],
    [ @rules-9  => "pretty-weak"       ],
    [     2     => "weak"              ],
    [     0     => "trivial"           ],
);

Now we are ready to read the password from the command line and do some rule-checking:

my $password = shift;

my $score = 0;
my @failures;

I decided to keep a list of all failed rules in @failures and only print them at the end.

for (@rules) {
    my($name, $test) = @$_;

While walking through the rules, we remember that each rule is an array reference, with the elements of the referenced array being the name and the function reference for this rule.

    local $_ = $password;

We set $_ to the password before calling each rule. This allows the rules to modify the value in $_ without interfering with any other rule.

    if ($test->()) {
        ++$score;
        next;
    }
    push @failures, $name;
}

$test->() invokes the code reference stored in $test with an empty parameter list. We increment the $score if the rule passes. Otherwise we note the rule name in the @failures list.

if (@failures > 1) {
    print wrap("* ", "  ", $_), "\n\n" for @failures;
}
elsif (@failures) {
    print wrap("", "", $failures[0]), "\n\n";
}

Some of the rule names are pretty long and would wrap at an arbitrary point (80 columns when displayed in a standard console window). The Text::Wrap module contains a nice wrap() function that we can use to break the text at proper word boundaries before we reach the full terminal width.

I also wanted to display the failed rules as a bulleted list if we had more than one failure; that makes everything easier to read. The wrap() function can do this too, by prefixing the first line in each rule with “* ”, and all other lines with 2 spaces:

C:\>perl ae5.pl too1box
* Make sure that the password does not simply substitute 1 (one) for the
  letter l (either an uppercase L or a lowercase l).

* Make sure that the password is at least 10 characters long but no more
  than 20 characters long.

* Make sure that the password includes at least one uppercase letter.

* Make sure that the password includes at least one symbol.

* Make sure that the password does not include any duplicate characters.

The only thing left to do is to calculate and display the strength rating:

for (@ratings) {
    my($min, $strong) = @$_;
    next if $score < $min;
    print "A password score of $score indicates a $strong password.\n";
    last;
}

For our sample “too1box” it displays:

A password score of 8 indicates a moderately-strong password.

Personally I find some of the rules used in this event questionable: Why is a password of 21 characters weaker than one with 15 characters? Supposedly you will be able to remember the 15 character password, but have to write down the one with 21 characters… Anyways, please let me know at jand@activestate.com which rules you would use to calculate a password’s strength!

PS: Thanks to Bram for pointing out a logic error in one of my rules!


Top of pageTop of page