
14.5 List Comprehensions

Because mapping operations over sequences and collecting results is such a common task in Python coding, Python 2.0 sprouted a new feature, the list comprehension expression, that can make this even simpler than using map and filter. Technically, this feature is not tied to functions, but we've saved it for this point in the book because it is usually best understood by analogy to function-based alternatives.

14.5.1 List Comprehension Basics

Let's work through an example that demonstrates the basics. Python's built-in ord function returns the integer ASCII code of a single character:

>>> ord('s')
115

The chr built-in is the converse: it returns the character for an ASCII code integer. Now, suppose we wish to collect the ASCII codes of all characters in an entire string. Perhaps the most straightforward approach is to use a simple for loop, and append results to a list:

>>> res = []
>>> for x in 'spam':
...     res.append(ord(x))
...
>>> res
[115, 112, 97, 109]

Now that we know about map, we can achieve similar results with a single function call without having to manage list construction in the code:

>>> res = map(ord, 'spam')            # Apply func to seq.
>>> res
[115, 112, 97, 109]

But as of Python 2.0, we get the same results from a list comprehension expression:

>>> res = [ord(x) for x in 'spam']    # Apply expr to seq.
>>> res
[115, 112, 97, 109]

List comprehensions collect the results of applying an arbitrary expression to a sequence of values, and return them in a new list. Syntactically, list comprehensions are enclosed in square brackets (to remind you that they construct a list). In their simple form, within the brackets, you code an expression that names a variable, followed by what looks like a for loop header that names the same variable. Python collects the expression's results, for each iteration of the implied loop.
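As a side-by-side sketch, the three approaches above can be written as follows. This version assumes Python 3, where map returns an iterator rather than a list, so its result is wrapped in list(); the variable names are illustrative only.

```python
# Three ways to collect character codes; assumes Python 3.
text = 'spam'

# 1. Manual loop: build the list by hand.
by_loop = []
for ch in text:
    by_loop.append(ord(ch))

# 2. map: in Python 3 this yields an iterator, so wrap it in list().
by_map = list(map(ord, text))

# 3. List comprehension: the expression first, then the loop header.
by_comp = [ord(ch) for ch in text]

print(by_loop)                        # [115, 112, 97, 109]
print(by_loop == by_map == by_comp)   # True
```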

The effect of the example so far is similar to both the manual for loop and the map call. List comprehensions become handier, though, when we wish to apply an arbitrary expression to a sequence:

>>> [x ** 2 for x in range(10)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Here, we've collected the squares of the numbers 0 to 9. To do similar work with a map call, we would probably invent a little function to implement the square operation. Because we won't need this function elsewhere, it would typically be coded inline, with a lambda:

>>> map((lambda x: x**2), range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

This does the same job, and is only a few keystrokes longer than the equivalent list comprehension. For more advanced kinds of expressions, though, list comprehensions will often require less typing. The next section shows why.

14.5.2 Adding Tests and Nested Loops

List comprehensions are more general than shown so far. For instance, you can code an if clause after the for, to add selection logic. List comprehensions with if clauses can be thought of as analogous to the filter built-in of the prior section: they skip sequence items for which the if clause is not true. Here are both schemes picking up even numbers from 0 to 4; as with map, the filter call uses a little lambda function for the test expression. For comparison, the equivalent for loop is shown here as well:

>>> [x for x in range(5) if x % 2 == 0]
[0, 2, 4]

>>> filter((lambda x: x % 2 == 0), range(5))
[0, 2, 4]

>>> res = []
>>> for x in range(5):
...     if x % 2 == 0: res.append(x)
...
>>> res
[0, 2, 4]

All of these use the modulus operator (remainder of division) to detect evens: if there is no remainder after dividing a number by two, it must be even. The filter call is not much longer than the list comprehension here either. However, the combination of an if clause and an arbitrary expression gives list comprehensions the effect of a filter and a map, in a single expression:

>>> [x**2 for x in range(10) if x % 2 == 0]
[0, 4, 16, 36, 64]

This time, we collect the squares of the even numbers from 0 to 9: the for loop skips numbers for which the attached if clause on the right is false, and the expression on the left computes squares. The equivalent map call would be more work on our part: we would have to combine filter selections with map iteration, making for a noticeably more complex expression:

>>> map((lambda x: x**2), filter((lambda x: x % 2 == 0), range(10)))
[0, 4, 16, 36, 64]
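One consequence of this ordering is worth making explicit: the if clause is tested before the expression on the left is evaluated, so it can guard the expression against values it cannot handle. A minimal sketch, with illustrative names:

```python
# The if clause is tested first; the expression on the left runs
# only for items that pass, so zero never reaches the division.
quotients = [10 // x for x in range(-2, 3) if x != 0]
print(quotients)   # [-5, -10, 10, 5]

# Equivalent statement form, showing the same order of evaluation.
res = []
for x in range(-2, 3):
    if x != 0:
        res.append(10 // x)
```

Without the if clause, the comprehension would raise ZeroDivisionError when x reaches 0.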

In fact, list comprehensions are even more general still. You may code nested for loops, and each may have an associated if test. The general structure of list comprehensions looks like this:

[ expression for target1 in sequence1 [if condition]
             for target2 in sequence2 [if condition] ...
             for targetN in sequenceN [if condition] ]

When for clauses are nested within a list comprehension, they work like equivalent nested for loop statements. For example, the following:

>>> res = [x+y for x in [0,1,2] for y in [100,200,300]]
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]

has the same effect as the substantially more verbose equivalent statements:

>>> res = []
>>> for x in [0,1,2]:
...     for y in [100,200,300]:
...         res.append(x+y)
...
>>> res
[100, 200, 300, 101, 201, 301, 102, 202, 302]
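Because the clauses nest from left to right, an inner for clause may refer to the target of an outer one, just as in nested statements. A small sketch of that (the names pairs and res are illustrative):

```python
# The inner sequence depends on the outer loop variable x.
pairs = [(x, y) for x in range(4) for y in range(x)]
print(pairs)
# [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]

# Equivalent nested statements: the leftmost for is the outermost loop.
res = []
for x in range(4):
    for y in range(x):
        res.append((x, y))
```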

Although list comprehensions construct a list, remember that they can iterate over any sequence type. Here's a similar bit of code that traverses strings instead of lists of numbers, and so collects concatenation results:

>>> [x+y for x in 'spam' for y in 'SPAM']
['sS', 'sP', 'sA', 'sM', 'pS', 'pP', 'pA', 'pM', 
'aS', 'aP', 'aA', 'aM', 'mS', 'mP', 'mA', 'mM']

Finally, here is a much more complex list comprehension. It illustrates the effect of attached if selections on nested for clauses:

>>> [(x,y) for x in range(5) if x%2 == 0 for y in range(5) if y%2 == 1]
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]

This expression pairs each even number from 0 to 4 with each odd number from 0 to 4. The if clauses filter out items in each sequence iteration. Here's the equivalent statement-based code; to derive it, nest the list comprehension's for and if clauses inside each other. The result is longer, but perhaps clearer:

>>> res = []
>>> for x in range(5):
...     if x % 2 == 0:
...         for y in range(5):
...             if y % 2 == 1:
...                 res.append((x, y))
...
>>> res
[(0, 1), (0, 3), (2, 1), (2, 3), (4, 1), (4, 3)]

The map and filter equivalent would be wildly complex and nested, so we won't even try showing it here. We'll leave its coding as an exercise for Zen masters, ex-LISP programmers, and the criminally insane.

14.5.3 Comprehending List Comprehensions

With such generality, list comprehensions can quickly become, well, incomprehensible, especially when nested. Because of that, our advice would normally be to use simple for loops when getting started with Python, and map calls in most other cases (unless they get too complex). The "Keep It Simple" rule applies here, as always; code conciseness is much less important a goal than code readability.

However, there is currently a substantial performance advantage to the extra complexity in this case: based on tests run under Python 2.2, map calls are roughly twice as fast as equivalent for loops, and list comprehensions are usually very slightly faster than map. This speed difference stems from the fact that map and list comprehensions run at C language speed inside the interpreter, rather than stepping through Python for loop code within the PVM (the Python virtual machine).
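Timings like these vary by Python version and machine, so it is worth measuring rather than assuming. The following sketch uses the standard timeit module; the helper names and the repetition count are arbitrary choices, and it assumes Python 3, so the map result is wrapped in list(). No particular ranking is implied for your interpreter.

```python
import timeit

data = range(1000)

def with_loop():
    res = []
    for x in data:
        res.append(abs(x))
    return res

def with_map():
    return list(map(abs, data))

def with_comp():
    return [abs(x) for x in data]

# Time each variant; number=200 is an arbitrary repetition count.
timings = {
    f.__name__: timeit.timeit(f, number=200)
    for f in (with_loop, with_map, with_comp)
}

# Report fastest first; the order depends on the interpreter in use.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print('%-10s %.4f s' % (name, seconds))
```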

Because for loops make logic more explicit, we recommend them in general on grounds of simplicity. map, and especially list comprehensions, are worth knowing if your application's speed is an important consideration. In addition, because map and list comprehensions are both expressions, they can show up syntactically in places that for loop statements cannot, such as in the bodies of lambda functions, within list and dictionary literals, and more. Still, you should try to keep your map calls and list comprehensions simple; for more complex tasks, use full statements instead.
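To illustrate the point about expressions, the sketch below places comprehensions inside a lambda body and inside a dictionary literal, two spots where a for statement is not allowed. The names are hypothetical.

```python
# A comprehension as the body of a lambda.
squares_up_to = lambda n: [x ** 2 for x in range(n)]

# Comprehensions as values inside a dictionary literal.
table = {
    'evens': [x for x in range(10) if x % 2 == 0],
    'odds':  [x for x in range(10) if x % 2 == 1],
}

print(squares_up_to(4))   # [0, 1, 4, 9]
print(table['odds'])      # [1, 3, 5, 7, 9]
```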

Why You Will Care: List Comprehensions and map

Here's a more realistic example of list comprehensions and map in action. Recall that the file readlines method returns lines with the \n end-line character still attached:

>>> open('myfile').readlines()
['aaa\n', 'bbb\n', 'ccc\n']

If you don't want the end-line, you can slice it off all the lines in a single step, with either a list comprehension or a map call:

>>> [line[:-1] for line in open('myfile').readlines()]
['aaa', 'bbb', 'ccc']

>>> [line[:-1] for line in open('myfile')]
['aaa', 'bbb', 'ccc']

>>> map((lambda line: line[:-1]), open('myfile'))
['aaa', 'bbb', 'ccc']

The last two of these make use of file iterators, which essentially means you don't need a method call to grab all the lines in iteration contexts such as these. The map call is just slightly longer than the list comprehensions, but neither has to manage result list construction explicitly.
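One caveat with line[:-1]: it removes the last character unconditionally, so a final line that lacks a trailing newline loses a real character. A hedged alternative is rstrip('\n'), sketched here with io.StringIO standing in for a real file so the example is self-contained; this assumes Python 3.

```python
import io

# StringIO stands in for open('myfile'); the last line has no '\n'.
fake_file = io.StringIO('aaa\nbbb\nccc')

sliced = [line[:-1] for line in fake_file]
print(sliced)      # ['aaa', 'bbb', 'cc']  <- last line damaged

fake_file.seek(0)  # rewind to read the lines again
stripped = [line.rstrip('\n') for line in fake_file]
print(stripped)    # ['aaa', 'bbb', 'ccc']
```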

List comprehensions can also be used as a sort of column projection operation. Python's standard SQL database API returns query results as a list of tuples: the list is the table, tuples are rows, and items in tuples are column values, much like the following list:

listoftuple = [('bob', 35, 'mgr'), ('mel', 40, 'dev')]

A for loop could pick up all values from a selected column manually, but map and list comprehensions can do it in a single step, and faster:

>>> [age for (name, age, job) in listoftuple]
[35, 40]
>>> map((lambda (name, age, job): age), listoftuple)
[35, 40]

Both of these make use of tuple assignment to unpack row tuples in the list. See other books and resources for more on Python's database API.
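As a concrete sketch of this projection, the standard library's sqlite3 module can supply the rows. The table name, column names, and data below are invented for illustration. The code assumes Python 3, where the tuple-unpacking lambda shown above is no longer legal syntax, so operator.itemgetter stands in for it.

```python
import sqlite3
from operator import itemgetter

# In-memory database with a hypothetical 'people' table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE people (name TEXT, age INTEGER, job TEXT)')
conn.executemany(
    'INSERT INTO people VALUES (?, ?, ?)',
    [('bob', 35, 'mgr'), ('mel', 40, 'dev')],
)

# fetchall() returns the result table as a list of row tuples.
rows = conn.execute(
    'SELECT name, age, job FROM people ORDER BY name'
).fetchall()
conn.close()

# Project the age column: comprehension with tuple unpacking, then map.
ages_comp = [age for (name, age, job) in rows]
ages_map = list(map(itemgetter(1), rows))
print(ages_comp)   # [35, 40]
```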

