14.5 List Comprehensions
Because mapping operations over sequences and collecting results is
such a common task in Python coding, Python 2.0 sprouted a new
feature, the list comprehension expression, that can make
this even simpler than using map and
filter. Technically, this feature is not tied to
functions, but we've saved it for this point in the
book, because it is usually best understood by analogy to
function-based alternatives.
14.5.1 List Comprehension Basics
Let's work through an example that demonstrates the
basics. Python's built-in ord
function returns the integer ASCII code of a single character:
>>> ord('s')
115
The chr built-in is the converse: it returns
the character for an ASCII code integer. Now, suppose we wish to
collect the ASCII codes of all characters in an
entire string. Perhaps the most straightforward approach is to use a
simple for loop, and append results to a list:
>>> res = [ ]
>>> for x in 'spam':
...     res.append(ord(x))
...
>>> res
[115, 112, 97, 109]
Now that we know about map, we can achieve similar
results with a single function call without having to manage list
construction in the code:
>>> res = map(ord, 'spam') # Apply func to seq.
>>> res
[115, 112, 97, 109]
But as of Python 2.0, we get the same results from a list
comprehension expression:
>>> res = [ord(x) for x in 'spam'] # Apply expr to seq.
>>> res
[115, 112, 97, 109]
List comprehensions collect the results of applying an arbitrary
expression to a sequence of values, and return them in a new list.
Syntactically, list comprehensions are enclosed in square brackets
(to remind you that they construct a list). In their simple form,
within the brackets, you code an expression that names a variable,
followed by what looks like a for loop header that
names the same variable. Python collects the
expression's results, for each iteration of the
implied loop.
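As a quick check of the description above, the following sketch builds the same list three ways and confirms that the results agree. It is written so that it also runs under Python 3, where map returns an iterator rather than a list, which is why the map call is wrapped in list( ). The variable names are ours, chosen for illustration only. The last step uses chr to turn the codes back into text.

```python
text = 'spam'

# Manual loop: build the result list by hand.
codes_loop = []
for ch in text:
    codes_loop.append(ord(ch))

# map call: wrapped in list() because Python 3's map returns an iterator.
codes_map = list(map(ord, text))

# List comprehension: expression on the left, loop header on the right.
codes_comp = [ord(ch) for ch in text]

# chr is the converse of ord, so the codes can be converted back.
round_trip = ''.join([chr(code) for code in codes_comp])

print(codes_loop == codes_map == codes_comp)
print(round_trip)
```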
The effect of the example so far is similar to both the manual
for loop, and the map call.
List comprehensions become more handy, though, when we wish to apply
an arbitrary expression to a sequence:
>>> [x ** 2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Here, we've collected the squares of the numbers 0
to 9. To do similar work with a map call, we would
probably invent a little function to implement the square operation.
Because we won't need this function elsewhere, it
would typically be coded inline, with a lambda:
>>> map((lambda x: x**2), range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
This does the same job, and is only a few keystrokes longer than the
equivalent list comprehension. For more advanced kinds of
expressions, though, list comprehensions will often require less
typing. The next section shows why.
14.5.2 Adding Tests and Nested Loops
List comprehensions are more general
than shown so far. For instance, you can code an
if clause after the for, to add
selection logic. List comprehensions with if
clauses can be thought of as analogous to the
filter built-in of the prior section: they
skip sequence items for which the if clause is not
true. Here are both schemes picking up even numbers from 0 to 4; as
with map, the filter call uses a little
lambda function for the test expression. For
comparison, the equivalent for loop is shown here
as well:
>>> [x for x in range(5) if x % 2 == 0]
[0, 2, 4]
>>> filter((lambda x: x % 2 == 0), range(5))
[0, 2, 4]
>>> res = [ ]
>>> for x in range(5):
...     if x % 2 == 0: res.append(x)
...
>>> res
[0, 2, 4]
All of these are using modulus (remainder of division) to detect
evens: if there is no remainder after dividing a number by two, it
must be even. The filter call is not much longer
than the list comprehension here either. However, the
combination of an if clause
and an arbitrary expression gives list comprehensions the effect of a
filter and a map, in a single
expression:
>>> [x**2 for x in range(10) if x % 2 == 0]
[0, 4, 16, 36, 64]
This time, we collect the squares of the even numbers from 0 to
9: the for loop skips numbers for which the
attached if clause on the right is false, and the
expression on the left computes squares. The equivalent
map call would be more work on our part: we would
have to combine filter selections with
map iteration, making for a noticeably more
complex expression:
>>> map((lambda x: x**2), filter((lambda x: x % 2 == 0), range(10)))
[0, 4, 16, 36, 64]
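If the lambdas make the combined call hard to read, one option is to name the two small functions first. The sketch below does that; is_even and square are hypothetical helper names of our own, and the map result is wrapped in list( ) so the code also runs under Python 3. In both forms the test is applied before the expression, so only the even numbers are ever squared.

```python
def is_even(n):
    # Selection test: true when n divides by two with no remainder.
    return n % 2 == 0

def square(n):
    # Expression applied to each item that passes the test.
    return n ** 2

# filter runs first (inner call), then map squares what is left.
via_map = list(map(square, filter(is_even, range(10))))

# The comprehension states the same two steps in a single expression.
via_comp = [square(n) for n in range(10) if is_even(n)]

print(via_map)
print(via_comp)
```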
In fact, list comprehensions are even more general still. You may
code nested for loops, and each may have an
associated if test. The general structure of list
comprehensions looks like this:
[ expression for target1 in sequence1 [if condition]
for target2 in sequence2 [if condition] ...
for targetN in sequenceN [if condition] ]
When for clauses are nested within a list
comprehension, they work like equivalent nested
for loop statements. For example, the following:
>>> res = [x+y for x in [0,1,2] for y in [100,200,300]]
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]
has the same effect as the substantially more verbose equivalent
statements:
>>> res = [ ]
>>> for x in [0,1,2]:
...     for y in [100,200,300]:
...         res.append(x+y)
...
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]
Although list comprehensions construct a list, remember that they can
iterate over any sequence type. Here's a similar bit
of code that traverses strings instead of lists of numbers, and so
collects concatenation results:
>>> [x+y for x in 'spam' for y in 'SPAM']
['sS', 'sP', 'sA', 'sM', 'pS', 'pP', 'pA', 'pM',
'aS', 'aP', 'aA', 'aM', 'mS', 'mP', 'mA', 'mM']
Finally, here is a much more complex list comprehension. It
illustrates the effect of attached if selections
on nested for clauses:
>>> [(x,y) for x in range(5) if x%2 == 0 for y in range(5) if y%2 == 1]
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
This expression pairs each even number from 0 to 4 with each odd
number from 0 to 4. The if clauses filter out items in
each sequence iteration. Here's the equivalent
statement-based code; to derive it, nest the list
comprehension's for and
if clauses inside each other. The result is longer,
but perhaps clearer:
>>> res = [ ]
>>> for x in range(5):
...     if x % 2 == 0:
...         for y in range(5):
...             if y % 2 == 1:
...                 res.append((x, y))
...
>>> res
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]
The map and filter equivalent
would be wildly complex and nested, so we won't even
try showing it here. We'll leave its coding as an
exercise for Zen masters, ex-LISP programmers, and the criminally
insane.
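One consequence of the nesting rule is worth a small sketch: because each for clause runs inside the ones to its left, a later clause may refer to a variable assigned by an earlier one. The example below is our own illustration of the general form, with three for clauses and two if tests; the names triples and expanded are made up for the purpose.

```python
# Three for clauses; the first and third carry an if test.
# The last test (z > x) refers to x from the outermost clause.
triples = [(x, y, z)
           for x in range(4) if x % 2 == 0
           for y in 'ab'
           for z in range(3) if z > x]

# The same logic as nested statements, clause by clause.
expanded = []
for x in range(4):
    if x % 2 == 0:
        for y in 'ab':
            for z in range(3):
                if z > x:
                    expanded.append((x, y, z))

print(triples)
print(triples == expanded)
```

When x is 2, no z in range(3) passes the test, so only the x equal to 0 rows appear in the result.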
14.5.3 Comprehending List Comprehensions
With such generality, list comprehensions can quickly become, well,
incomprehensible, especially when nested. Because of that, our advice
would normally be to use simple for loops when
getting started with Python, and map calls in most
other cases (unless they get too complex). The "Keep
It Simple" rule applies here, as always; code
conciseness is much less important a goal than code readability.
However, there is currently a substantial performance advantage to
the extra complexity in this case: based on tests run under Python
2.2, map calls are roughly twice as fast as
equivalent for loops, and list comprehensions are
usually very slightly faster than map. This speed
difference owes to the fact that map and list
comprehensions run at C language speed inside the interpreter, rather
than stepping through Python for loop code within
the PVM (the Python virtual machine).
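Relative speeds like these depend on the Python release and the machine, so it is worth measuring on your own setup rather than relying on the figures quoted here. The following is a minimal sketch of one way to do so. It assumes the standard library's timeit module, which was added after the release tested above, and the three function names are ours. It makes no claim about which version wins.

```python
import timeit

def with_loop(values):
    # Statement-based version: explicit list construction.
    result = []
    for v in values:
        result.append(abs(v))
    return result

def with_map(values):
    # map version; list() is needed on Python 3, where map is lazy.
    return list(map(abs, values))

def with_comprehension(values):
    # List comprehension version.
    return [abs(v) for v in values]

values = list(range(-500, 500))

timings = {}
for func in (with_loop, with_map, with_comprehension):
    # Time 200 calls of each function on the same input.
    timings[func.__name__] = timeit.timeit(lambda: func(values), number=200)

for name in sorted(timings):
    print('%-20s %.4f seconds' % (name, timings[name]))
```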
Because for loops make logic more explicit, we
recommend them in general on grounds of simplicity.
map, and especially list comprehensions, are worth
knowing if your application's speed is an important
consideration. In addition, because map and list
comprehensions are both expressions, they can show up syntactically
in places that for loop statements cannot, such as
in the bodies of lambda functions, within list and
dictionary literals, and more. Still, you should try to keep your
map calls and list comprehensions simple; for more
complex tasks, use full statements instead.
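To make the point about expressions concrete, here is a short sketch of our own that places a list comprehension in a lambda body and both tools inside dictionary and list literals, none of which a for statement could do. The names squares_upto, table, and nested are hypothetical, and map is wrapped in list( ) so the code also runs under Python 3.

```python
# A list comprehension as the body of a lambda function.
squares_upto = lambda n: [x ** 2 for x in range(n)]

# A comprehension and a map call as values in a dictionary literal.
table = {
    'evens': [x for x in range(10) if x % 2 == 0],
    'codes': list(map(ord, 'spam')),
}

# Comprehension results nested directly inside a list literal.
nested = [[x for x in range(3)], squares_upto(4)]

print(squares_upto(4))
print(table['evens'], table['codes'])
print(nested)
```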
Here's a more realistic example of list
comprehensions and map in action. Recall that the
file readlines method returns lines with their
\n end-line character at the end:
>>> open('myfile').readlines( )
['aaa\n', 'bbb\n', 'ccc\n']
If you don't want the end-line, you can slice it off
of all lines in a single step, with either a list comprehension or a
map call:
>>> [line[:-1] for line in open('myfile').readlines( )]
['aaa', 'bbb', 'ccc']
>>> [line[:-1] for line in open('myfile')]
['aaa', 'bbb', 'ccc']
>>> map((lambda line: line[:-1]), open('myfile'))
['aaa', 'bbb', 'ccc']
The last two of these make use of file iterators:
in iteration contexts such as these, you don't need a
method call to grab all the lines. The map call is just
slightly longer than the list comprehensions, but neither has to
manage result list construction explicitly.
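One caveat about the line[:-1] slice: it removes the last character whatever it is, so a final line with no end-line character loses real data. The sketch below shows the difference using the string rstrip method as an alternative. The temporary file and its contents are made up for the demonstration, and the last line is deliberately written without a trailing \n.

```python
import os
import tempfile

# Create a throwaway file whose last line lacks an end-line character.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('aaa\nbbb\nccc')

# Slicing drops the final character of every line unconditionally.
with open(path) as f:
    sliced = [line[:-1] for line in f]

# rstrip('\n') removes an end-line only if one is present.
with open(path) as f:
    stripped = [line.rstrip('\n') for line in f]

os.remove(path)

print(sliced)    # ['aaa', 'bbb', 'cc']
print(stripped)  # ['aaa', 'bbb', 'ccc']
```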
List comprehensions can also be used as a sort of column projection
operation. Python's standard SQL database API
returns query results as a list of tuples: the list is the
table, tuples are rows, and items in tuples are column values, much
like the following list:
listoftuple = [('bob', 35, 'mgr'), ('mel', 40, 'dev')]
A for loop could pick up all values from a
selected column manually, but map and list
comprehensions can do it in a single step, and faster:
>>> [age for (name, age, job) in listoftuple]
[35, 40]
>>> map((lambda (name, age, job): age), listoftuple)
[35, 40]
Both of these make use of tuple assignment to unpack row tuples in
the list. See other books and resources for more on
Python's database API.
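For a feel of how this looks against a real query result, here is a minimal sketch that assumes the standard library's sqlite3 module and an in-memory database; the people table and its columns are invented for the example. Note that the lambda with a tuple parameter shown above is no longer accepted in Python 3, so the map version here indexes each row instead.

```python
import sqlite3

# Build a small in-memory table to query (hypothetical schema).
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE people (name TEXT, age INTEGER, job TEXT)')
cur.executemany('INSERT INTO people VALUES (?, ?, ?)',
                [('bob', 35, 'mgr'), ('mel', 40, 'dev')])

# fetchall returns the result table as a list of row tuples.
cur.execute('SELECT name, age, job FROM people ORDER BY age')
rows = cur.fetchall()
conn.close()

# Column projection by tuple unpacking in the comprehension.
ages = [age for (name, age, job) in rows]

# The map equivalent, indexing the row rather than unpacking it.
ages_via_map = list(map(lambda row: row[1], rows))

print(ages)
print(ages_via_map)
```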