Learning Perl Objects References and Modules

6.5 Returning a Subroutine from a Subroutine

Although a naked block worked nicely to define the callback, having a subroutine return that subroutine reference instead might be more useful:

use File::Find;

sub create_find_callback_that_counts {
  my $count = 0;
  return sub { print ++$count, ": $File::Find::name\n" };
}

my $callback = create_find_callback_that_counts(  );
find($callback, ".");

It's the same process here, just written a bit differently. When you invoke create_find_callback_that_counts( ), a lexical variable $count is initialized to 0. The return value from that subroutine is a reference to an anonymous subroutine that is also a closure because it accesses the $count variable. Even though $count goes out of scope at the end of the create_find_callback_that_counts( ) subroutine, there's still a binding between it and the returned subroutine reference, so the variable stays alive until the subroutine reference is finally discarded.
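The binding described above can be seen without touching the filesystem. The following is a minimal sketch, not part of the original example: make_counter is a hypothetical stand-in for create_find_callback_that_counts( ) that returns the count instead of printing a filename.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): returns a closure over a
# private $count, mirroring create_find_callback_that_counts( ).
sub make_counter {
    my $count = 0;                  # initialized once per call to make_counter
    return sub { return ++$count }; # the closure keeps $count alive
}

my $next = make_counter();

# $count went out of scope when make_counter returned, yet every call
# through $next still sees and updates the same variable.
my @seen = ($next->(), $next->(), $next->());
print "@seen\n";    # prints "1 2 3"
```

The variable persists only as long as something holds the subroutine reference; here that is $next.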

If you reuse the callback, the same variable still has its most recently used value. The initialization occurred in the original subroutine (create_find_callback_that_counts), not the callback (unnamed) subroutine:

use File::Find;

sub create_find_callback_that_counts {
  my $count = 0;
  return sub { print ++$count, ": $File::Find::name\n" };
}

my $callback = create_find_callback_that_counts(  );
print "my bin:\n";
find($callback, "bin");
print "my lib:\n";
find($callback, "lib");

This example prints consecutive numbers starting at 1 for the entries below bin, and then continues the numbering when it moves on to the entries in lib. The same $count variable is used in both cases. However, if you invoke create_find_callback_that_counts( ) twice, you get two different $count variables:

use File::Find;

sub create_find_callback_that_counts {
  my $count = 0;
  return sub { print ++$count, ": $File::Find::name\n" };
}

my $callback1 = create_find_callback_that_counts(  );
my $callback2 = create_find_callback_that_counts(  );
print "my bin:\n";
find($callback1, "bin");
print "my lib:\n";
find($callback2, "lib");

In this case, you have two separate $count variables, each accessed from within its own callback subroutine.
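A small sketch may make the independence of the two variables easier to check. It assumes a hypothetical make_counter helper that returns the count rather than printing a filename, so it runs without any bin or lib directories.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): each call creates a fresh $count.
sub make_counter {
    my $count = 0;
    return sub { return ++$count };
}

my $first  = make_counter();
my $second = make_counter();

# Advance only the first counter three times.
$first->() for 1 .. 3;

my $from_first  = $first->();   # fourth call on the first counter
my $from_second = $second->();  # first call on the second counter

print "first: $from_first, second: $from_second\n";  # prints "first: 4, second: 1"
```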

How would you get the total size of all found files from the callback? Earlier, you were able to do this by making $total_size visible. If you stick the definition of $total_size into the subroutine that returns the callback reference, you won't have access to the variable. But you can cheat a bit. File::Find never passes any parameters to the callback subroutine, so if the subroutine is called with a parameter, you can make it return the total size:

use File::Find;

sub create_find_callback_that_sums_the_size {
  my $total_size = 0;
  return sub {
    if (@_) { # it's our dummy invocation
      return $total_size;
    } else { # it's a callback from File::Find:
      $total_size += -s if -f;
    }
  };
}

my $callback = create_find_callback_that_sums_the_size(  );
find($callback, "bin");
my $total_size = $callback->("dummy"); # dummy parameter to get size
print "total size of bin is $total_size\n";

Distinguishing actions by the presence or absence of parameters is not a universal solution. Fortunately, a single creating subroutine can build and return more than one subroutine reference:

use File::Find;

sub create_find_callbacks_that_sum_the_size {
  my $total_size = 0;
  return(sub { $total_size += -s if -f }, sub { return $total_size });
}

my ($count_em, $get_results) = create_find_callbacks_that_sum_the_size(  );
find($count_em, "bin");
my $total_size = &$get_results(  );
print "total size of bin is $total_size\n";

Because both subroutine references were created from the same scope, they both have access to the same $total_size variable. Even though the variable has gone out of scope before either subroutine is called, they still share the same heritage and can use the variable to communicate the result of the calculation.
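The shared-variable idea is independent of File::Find. As a rough sketch, assuming a hypothetical make_accumulator helper in place of create_find_callbacks_that_sum_the_size( ), one closure updates the variable and the other reads it:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): returns two closures that
# share a single $total variable.
sub make_accumulator {
    my $total = 0;
    return (
        sub { $total += shift; return },  # adder: updates the shared variable
        sub { return $total },            # getter: reads the same variable
    );
}

my ($add, $get) = make_accumulator();

$add->($_) for 10, 20, 30;

my $result = $get->();
print "total is $result\n";    # prints "total is 60"
```

Neither closure exposes $total directly, so the getter is the only way for outside code to read the result.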

Returning the two subroutine references from the creating subroutine does not invoke them. The references are just data at that point. It's not until you invoke them, either as a callback or by explicitly dereferencing them, that they actually do their duty.
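To illustrate that creating a subroutine reference is separate from running it, here is a minimal sketch. The make_pair helper and the @log array are hypothetical names introduced only to record the order of events.

```perl
use strict;
use warnings;

my @log;    # records what actually ran, in order

# Hypothetical helper (not from the text): builds two closures
# without calling either of them.
sub make_pair {
    push @log, 'factory ran';
    return (
        sub { push @log, 'first called' },
        sub { push @log, 'second called' },
    );
}

my ($first, $second) = make_pair();

# Only the creating subroutine has run so far; the references are plain data.
my $entries_after_create = scalar @log;   # 1
my $kind                 = ref $first;    # "CODE"

$second->();    # arrow-style invocation
&$first();      # explicit dereference, as in the text

print join(', ', @log), "\n";  # prints "factory ran, second called, first called"
```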

What if you invoke this new subroutine more than once?

use File::Find;

sub create_find_callbacks_that_sum_the_size {
  my $total_size = 0;
  return(sub { $total_size += -s if -f }, sub { return $total_size });
}

## set up the subroutines
my %subs;
foreach my $dir (qw(bin lib man)) {
  my ($callback, $getter) = create_find_callbacks_that_sum_the_size(  );
  $subs{$dir}{CALLBACK} = $callback;
  $subs{$dir}{GETTER} = $getter;
}

## gather the data
for (keys %subs) {
  find($subs{$_}{CALLBACK}, $_);
}

## show the data
for (sort keys %subs) {
  my $sum = $subs{$_}{GETTER}->(  );
  print "$_ has $sum bytes\n";
}

In the "set up the subroutines" section, you create three instances of callback-and-getter pairs. Each callback has a corresponding subroutine to get the results. Next, in the "gather the data" section, you call find three times, once with each callback subroutine reference. This updates the individual $total_size variable associated with each callback. Finally, in the "show the data" section, you call the getter routines to fetch the results.

The six subroutines (and the three $total_size variables they share) are reference-counted. When %subs goes out of scope, or when its entries are replaced or deleted, the reference counts of the affected values drop, and any value whose count reaches zero is recycled. (If that data also references further data, those reference counts are also reduced appropriately.)
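One way to watch this recycling happen is to let a closure capture an object that reports its own destruction. The sketch below is illustrative only: the Tracker class, make_closure, and @log are hypothetical names, and the example relies on Perl 5's reference counting freeing the captured variable as soon as the last subroutine reference is discarded.

```perl
use strict;
use warnings;

# Hypothetical class (not from the text): logs when an instance is destroyed.
package Tracker;
sub new     { my ($class, $log) = @_; return bless { log => $log }, $class }
sub DESTROY { my $self = shift; push @{ $self->{log} }, 'destroyed' }

package main;

my @log;

# Hypothetical helper: the returned closure captures $tracker.
sub make_closure {
    my $tracker = Tracker->new(\@log);
    return sub { return ref $tracker };
}

my $closure = make_closure();
my $before  = scalar @log;  # the closure still holds $tracker, nothing destroyed

undef $closure;             # drop the last reference to the closure
my $after = scalar @log;    # the captured object has now been destroyed

print "before: $before, after: $after\n";  # prints "before: 0, after: 1"
```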
