Learning Perl Objects References and Modules-Learning Perl Objects References and Modules

A.3 Answers for Chapter 4

A.3.1 Exercise 1 (Section 4.9.1)

The curly braces of the anonymous hash constructor make a reference to a hash. That's a scalar (as are all references), so it's not suitable to use alone as the value of a hash. Perhaps this code's author intended to assign to scalar variables (like $passenger_1 and $passenger_2) instead of to hashes. But you can fix the problem simply by changing the two pairs of curly braces to parentheses.

If you tried running this, Perl may have given you a helpful diagnostic message as a warning. If you didn't get the warning, perhaps you didn't have warnings turned on, either with the -w switch or with the use warnings pragma. Even if you don't usually use Perl's warnings, you should enable them during debugging. (How long would it take you to debug this without Perl's warnings to help you? How long would it take to enable Perl's warnings? 'Nuff said.)

What if you got the warning message but couldn't tell what it meant? That's what the perldiag manpage is for. Warning texts need to be concise because they're compiled into the perl binary (the program that runs your Perl code). But perldiag should list all the messages you should ever get from Perl, along with a longer explanation of what each one means, why it's a problem, and how to fix it.

If you want to be ultimately lazy, you can add use diagnostics; at the beginning of your program, and any error message will look itself up in the documentation and display the entire detailed message. Don't leave this in production code, however, unless you like burning a lot of CPU cycles every time your program starts, whether or not an error occurs.

A.3.2 Exercise 2 (Section 4.9.2)

You will be keeping count of how much data has been sent to all machines, so at the start, set the variable $all to a name that will stand in for all of them. It should be a name that will never be used for any real machine, of course. Storing it in a variable is convenient for writing the program and makes it easy to change later.

my $all = "**all machines**";

The input loop is nearly the same as given in the chapter, but it skips comment lines. Also, it keeps a second running total, filed under $all.

my %total_bytes;
while (<>) {
  next if /^#/;
  my ($source, $destination, $bytes) = split;
  $total_bytes{$source}{$destination} += $bytes;
  $total_bytes{$source}{$all} += $bytes;
}

Next, make a sorted list. This holds the names of the source machines in descending order of total transferred bytes. This list is used for the outer for loop. (Rather than using a temporary array, @sources, you might have put the sort directly into the parens of the for loop.)

my @sources =
  sort { $total_bytes{$b}{$all} <=> $total_bytes{$a}{$all} }
  keys %total_bytes;

for my $source (@sources) {
  my @destinations = 
    sort { $total_bytes{$source}{$b} <=> $total_bytes{$source}{$a} }
    keys %{ $total_bytes{$source} };
  print "$source: $total_bytes{$source}{$all} total bytes sent\n";
  for my $destination (@destinations) {
    next if $destination eq $all;
    print "  $source => $destination:",
      " $total_bytes{$source}{$destination} bytes\n";
  }
  print "\n";
}

Inside the loop, print out the total number of bytes sent from that source machine, then make sorted list of the destination files (similar to the list in @sources). As you step through that list, use next to skip over the dummy $all item. Because that item will be at the head of the sorted list, why wasn't shift used to discard it, since that would avoid checking repeatedly for $all inside the inner loop? The answer is in this footnote.^[1]

^[1] Even though the dummy item will sort to the head of the sorted list, it won't necessarily be the first item in the list. If a machine sent data to just one other, that destination machine's total will be equal to the source machine's total output, so that list could sort in either order.

You can simplify this program, perhaps. The subexpression $total_bytes{$source} is used many times in the large output for loop (and twice in the input loop as well). That can be replaced by a simple scalar, initialized at the top of the loop:

for my $source (@sources) {
  my $tb = $total_bytes{$source};
  my @destinations = sort { $tb{$b} <=> $tb{$a} } keys %$tb;
  print "$source: $tb{$all} total bytes sent\n";
  for my $destination (@destinations) {
    next if $destination eq $all;
    print "  $source => $destination: $tb{$destination} bytes\n";
  }
  print "\n";
}

This makes the code shorter and (likely) a bit faster as well. Give yourself extra credit if you thought to do this. Also give yourself extra credit if you thought that it might be too confusing and decided not to make the change.

[ Team LiB ]