DekGenius.com
[ Team LiB ] Previous Section Next Section

7.1 Review of Sorting

Perl's built-in sort operator sorts text strings in their natural text order, by default.[1]

[1] My friends call that the "ASCIIbetical" ordering. Normally modern Perl doesn't use ASCII; instead, it uses a default sort order, depending on the current locale and character set; see the perllocale (not perllocal!) manpage.

This is fine if you want to sort text strings:

my @sorted = sort qw(Gilligan Skipper Professor Ginger Mary_Ann);

but gets pretty messy when you want to sort numbers:

my @wrongly_sorted = sort 1, 2, 4, 8, 16, 32;

The resulting list is 1, 16, 2, 32, 4, 8. Why didn't sort order these properly? It treats each item as a string and sorts them in string order. Any string that begins with 3 sorts before any string that begins with 4.

You can fix this by overriding how Perl compares pairs of items in the list. By default, as Perl orders the items, a string comparison is used. A new comparison is specified using a sort block, placed between the sort keyword and the list of things to sort.[2]

[2] A sort block can also be specified as a subroutine name, causing the subroutine to be invoked when a comparison is needed.

Within the sort block, $a and $b stand in for two of the items to be sorted. The last evaluated expression must return a -1, 0, or +1 value.[3] If the value is -1, the value currently in $a must appear before the value in $b in the final sorted list. If the value is +1, then the value in $a must appear after the value in $b in the final sorted list. If the value is 0, you don't know or can't tell, so the results are unpredictable.[4]

[3] Actually, you can use any negative or positive number in place of -1 and +1, respectively.

[4] Recent Perl versions include a default sorting engine that is stable, so zero returns from the sort block cause the relative ordering of $a and $b to reflect their order in the original list. Older versions of Perl didn't guarantee such stability, and a future version might not use a stable sort, so don't rely on it.

For example, to sort those numbers in their proper order, you can use a sort block comparing $a and $b, like so:

my @numerically_sorted = sort {
  if ($a < $b)    { -1 }
  elsif ($a > $b) { +1 }
  else            {  0 }
} 1, 2, 4, 8, 16, 32;

Now you have a proper numeric comparison, so you have a proper numeric sort. Of course, this is far too much typing, so you can use the spaceship operator instead:

my @numerically_sorted = sort { $a <=> $b } 1, 2, 4, 8, 16, 32;

The spaceship operator returns -1, 0, and +1, according to the rules discussed. A descending sort is as simple as reversing the position of $a and $b:

my @numerically_descending = sort { $b <=> $a } 1, 2, 4, 8, 16, 32;

In every place the previous sort expression returned -1, this expression returns +1, and vice versa. Thus, the sort is in the opposite order. It's also easy to remember because if $a is to the left of $b, you get out the lower items first, just like a and b would be in the resulting list.

Likewise, a string sort can be indicated with cmp, although this is used less often because it is the default comparison. The cmp operator is handier when you have a more complex comparison, as you'll see shortly.

    [ Team LiB ] Previous Section Next Section