
27.2 Conversions, Numbers, and Comparisons

While we've covered data types, one of the common issues when dealing with any type system is how one converts from one type to another. These conversions happen in a myriad of contexts—reading numbers from a text file, computing integer averages, interfacing with functions that expect different types than the rest of an application, etc.

We've seen in previous chapters that we can create a string from a nonstring object by simply passing the nonstring object to the str string constructor. Similarly, unicode converts any object to its Unicode string form and returns it.[1]

[1] As we're not going to be subclassing from built-in types in this chapter, it makes no difference to us whether these conversion calls are functions (which they were until recent versions of Python) or class creators (which they are in Python 2.2 or later)—either way, they take objects as input and return new objects of the appropriate type (assuming the specific conversion is allowed). In this section we'll refer to them as functions as a matter of convenience.

In addition to the string creation functions, we've seen list and tuple, which take sequences and return list and tuple versions of them, respectively. int, long, float, and complex take a number and convert it to their respective types (although int, long, and float reject complex arguments). int, long, and float have additional features that can be confusing. First, int and long truncate their numeric arguments toward zero, if necessary, to perform the operation, thereby losing information and performing a conversion that may not be what you want (the round built-in rounds numbers the standard way and returns a float). Second, int, long, and float can also convert strings to their respective types, provided the strings are valid integer (or long, or float) literals. Literals are the pieces of program text that are converted to numbers early in the Python compilation process. So, the text 1244 in your Python program file (which is necessarily a sequence of characters) is a valid integer literal, but def foo( ): isn't.

>>> int(1.0), int(1.4), int(1.9), round(1.9), int(round(1.9))
(1, 1, 1, 2.0, 2)
>>> int("1")
1
>>> int("1.2")                             # This doesn't work.
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: invalid literal for int(): 1.2

What's a little odd is that the rule about strings (the string must be a valid integer literal) takes precedence over the truncation of numeric arguments, so int won't truncate a string that looks like a float:

>>> int("1.0")                               # Neither does this
Traceback (most recent call last):           # since 1.0 is also not a valid 
  File "<stdin>", line 1, in ?               # integer literal.
ValueError: invalid literal for int(): 1.0
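
The two behaviors can be seen side by side in a short sketch. The helper name to_int_or_none is made up for this example (it is not a built-in), and the snippet is written so that it also runs on current Python versions:

```python
def to_int_or_none(text):
    """Return int(text), or None if text is not a valid integer literal.

    to_int_or_none is an illustrative helper, not a built-in.
    """
    try:
        return int(text)
    except ValueError:
        return None

# Numeric arguments are truncated toward zero, not rounded.
truncated = [int(x) for x in (1.9, -1.9)]

# String arguments must already look like integers; surrounding
# whitespace is tolerated, but a decimal point is not.
parsed = [to_int_or_none(s) for s in ("1", " 42 ", "1.0", "spam")]
```

Here truncated holds 1 and -1, and only the first two strings in the second list convert; the other two yield None.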

Given the behavior of int, it may make sense in some cases to use a custom variant that does only conversion, refusing to truncate:

>>> def safeint(candidate):
...     converted = float(candidate)
...     rounded = round(converted)
...     if converted == rounded:
...         return int(converted)
...     else:
...         raise ValueError("%s would lose precision when cast" % candidate)
...
>>> safeint(3.0)
3
>>> safeint("3.0")
3
>>> safeint(3.1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 7, in safeint
ValueError: 3.1 would lose precision when cast
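
As a sketch of how such a helper might be used when reading numbers from text, the snippet below repeats safeint (using the call form of raise so that it also runs on current Python versions) and applies it to whitespace-separated fields. The function name parse_fields and the sample line are invented for illustration:

```python
def safeint(candidate):
    """Convert to int only when no fractional part would be lost."""
    converted = float(candidate)
    if converted == round(converted):
        return int(converted)
    raise ValueError("%s would lose precision when cast" % candidate)

def parse_fields(line):
    """Split a line on whitespace; integral fields become ints, others floats.

    parse_fields is a hypothetical helper for this example.
    """
    values = []
    for field in line.split():
        try:
            values.append(safeint(field))
        except ValueError:
            # Either the field has a fractional part or it is not a
            # number at all; float() raises again in the second case.
            values.append(float(field))
    return values

values = parse_fields("3.0 7 2.5")
```

With this input, the first two fields come back as integers and the third stays a float. A field that isn't numeric at all still raises ValueError, which is probably what you want when the data is malformed.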

Converting numbers to strings can be done in a variety of ways. In addition to using str or unicode, one can use hex or oct, which take integers (whether int or long) as arguments and return string representations of them in hexadecimal or octal format, respectively.

>>> hex(1000), oct(1000)
('0x3e8', '01750')
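
Going the other way, int accepts an optional second argument that gives the base of a string of digits. The following is a minimal sketch; it uses the %x and %o string formatting codes rather than hex and oct only because they produce bare digits, and the prefix that oct adds differs between Python versions:

```python
n = 1000

# The %x and %o format codes give hexadecimal and octal digits
# without the prefixes that hex and oct add.
hex_digits = "%x" % n
oct_digits = "%o" % n

# int(string, base) parses the digits back into an integer.
from_hex = int(hex_digits, 16)
from_oct = int(oct_digits, 8)

# A '0x' prefix is accepted when the base is given as 16.
from_prefixed = int("0x3e8", 16)
```

All three conversions recover the original value of n.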

The abs built-in returns the absolute value of scalars (integers, longs, floats) and the magnitude of complex numbers (the square root of the sum of the squared real and imaginary parts):

>>> abs(-1), abs(-1.2), abs(-3+4j)
(1, 1.2, 5.0)                               # 5 is sqrt(3*3 + 4*4).
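
To make the complex case concrete, this small sketch computes the same magnitude by hand with the math module and compares it with what abs returns:

```python
import math

z = -3 + 4j

# abs() of a complex number is its distance from the origin.
magnitude = abs(z)

# The same value computed explicitly from the real and imaginary parts.
by_hand = math.sqrt(z.real ** 2 + z.imag ** 2)
by_hypot = math.hypot(z.real, z.imag)
```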

The ord and chr functions return the ASCII value of single characters and vice versa:

>>> map(ord, "test")    # Remember that strings are sequences
[116, 101, 115, 116]    # of characters, so map can be used.
>>> chr(64)
'@'
>>> ord('@')
64
# map returns a list of single characters, so it
# needs to be "joined" into a str.
>>> map(chr, (83, 112, 97, 109, 33))
['S', 'p', 'a', 'm', '!']
# Can also be spelled using list comprehensions
>>> [chr(x) for x in (83, 112, 97, 109, 33)]
['S', 'p', 'a', 'm', '!']
>>> ''.join([chr(x) for x in (83, 112, 97, 109, 33)])
'Spam!'
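
Because ord and chr are inverses of each other, a string can be turned into numbers, manipulated arithmetically, and turned back. The helper names to_codes and from_codes below are invented for this sketch, which assumes plain ASCII text:

```python
def to_codes(text):
    """Return the list of character codes for text."""
    return [ord(ch) for ch in text]

def from_codes(codes):
    """Rebuild a string from a sequence of character codes."""
    return "".join([chr(code) for code in codes])

codes = to_codes("Spam!")
restored = from_codes(codes)

# Arithmetic on the codes: in ASCII, each uppercase letter sits 32
# below its lowercase counterpart.
shouted = from_codes([c - 32 for c in to_codes("spam")])
```

In real code, the upper string method is the right tool for changing case; the subtraction is shown only to illustrate that the codes are ordinary integers.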

The cmp built-in returns a negative integer, 0, or a positive integer, depending on whether its first argument is less than, equal to, or greater than its second one. It's worth emphasizing that cmp works with more than just numbers; it compares characters using their ASCII values, and sequences are compared by comparing their elements. Comparisons can raise exceptions, so the comparison function is not guaranteed to work on all objects, but all reasonable comparisons will work.[2] The comparison process used by cmp is the same as that used by the sort method of lists. It's also used by the built-ins min and max, which return the smallest and largest elements of the objects they are called with, dealing reasonably with sequences:

[2] For a variety of mostly historical reasons, even some unreasonable comparisons (1 > "2") will yield a value.

>>> min("pif", "paf", "pof")        # When called with multiple arguments,
'paf'                               # return appropriate one.
>>> min("ZELDA!"), max("ZELDA!")    # When called with a sequence,
('!', 'Z')                          # return the min/max element of it.
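
The ordering rules described above can be explored with a small sketch. Since cmp is not available in every Python version, the snippet defines a stand-in named compare (a made-up name) that follows the same sign convention, built from the ordinary < and > operators:

```python
def compare(a, b):
    """Return -1, 0, or 1, following the sign convention of cmp.

    compare is a stand-in written for this example.
    """
    return (a > b) - (a < b)

numbers = compare(1, 2)                  # 1 is less than 2
chars = compare("b", "a")                # 'b' follows 'a' in ASCII
seqs = compare([1, 2, 3], [1, 2, 4])     # decided by the last elements
equal = compare((1, "a"), (1, "a"))      # identical contents

# min and max order their arguments with the same comparisons.
smallest = min([3, 1], [2, 9], [2, 5])
largest = max("pif", "paf", "pof")
```

Note how the lists passed to min are ordered by their first elements, with the second elements consulted only to break the tie between [2, 9] and [2, 5].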

Table 27-1 summarizes the built-in functions dealing with type conversions. Many of these can also be called with no argument to return a false value; for example, str( ) returns the empty string.
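
The no-argument behavior is easy to check for yourself. This sketch assumes a Python version in which these names can be called without arguments, and collects the value each one returns:

```python
# Each of these conversion built-ins, called with no argument,
# returns the "empty" or zero value of its type.
factories = (str, list, tuple, dict, int, float, complex)
defaults = [factory() for factory in factories]

# Every one of those values is false in a Boolean context.
all_false = True
for value in defaults:
    if value:
        all_false = False
```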

Table 27-1. Type conversion built-in functions

Function name

Behavior

str(string), unicode(string)

Returns the string representation of any object:

>>> str(dir())
"['__builtins__', '__doc__', '__name__']"
>>> unicode('tomato')
u'tomato'

list(seq)

Returns the list version of a sequence:

>>> list("tomato")
['t', 'o', 'm', 'a', 't', 'o']
>>> list((1,2,3))
[1, 2, 3]

tuple(seq)

Returns the tuple version of a sequence:

>>> tuple("tomato")
('t', 'o', 'm', 'a', 't', 'o')
>>> tuple([0])
(0,)

dict(), dict(mapping), dict(seq), dict(**kwargs)

Creates a dictionary from its argument, which can be a mapping, a sequence of (key, value) pairs, keyword arguments, or nothing (yielding the empty dictionary):

>>> dict()
{}
>>> dict([('a', 2), ('b', 5)])
{'a': 2, 'b': 5}
>>> dict(a=2, b=5) # In Python 2.3 or later
{'a': 2, 'b': 5}

int(x)

Converts a string or number to a plain integer; truncates floating-point values. The string needs to be a valid integer literal (i.e., no decimal point):

>>> int("3") 
3

long(x)

Converts a string or number to a long integer; truncates floating-point values:

>>> long("3")
3L

float(x)

Converts a string or a number to floating point:

>>> float("3")
3.0

complex(real,imag)

Creates a complex number with the value real + imag*j:

>>> complex(3,5)
(3+5j)

hex(i)

Converts an integer number (of any size) to a hexadecimal string:

>>> hex(10000)
'0x2710'

oct(i)

Converts an integer number (of any size) to an octal string:

>>> oct(10000)
'023420'

ord(char)

Returns the numeric value of a string of one character (using the current default encoding, often ASCII):

>>> ord('A')
65

chr(i)

Returns a string of one character whose numeric code in the current encoding (often ASCII) is the integer i:

>>> chr(65)
'A'

min(i [, i]*)

Returns the smallest item of a nonempty sequence, or the smallest of two or more arguments:

>>> min([5,1,2,3,4])
1
>>> min(5,1,2,3,4)
1

max(i [, i]*)

Returns the largest item of a nonempty sequence, or the largest of two or more arguments:

>>> max([5,1,2,3,4])
5
>>> max(5,1,2,3,4)
5

file(name [, mode [, buffering]])

Opens a file and returns a file object:

>>> data = file('contents.txt', 'r').read()
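
Several of the conversions in Table 27-1 are often chained together. The following sketch strings a few of them end to end; the variable names are arbitrary:

```python
word = "tomato"

letters = list(word)            # string -> list of characters
frozen = tuple(letters)         # list -> tuple
rejoined = "".join(frozen)      # sequence of strings -> string

# dict accepts a sequence of (key, value) pairs, such as zip produces.
# A repeated key keeps the value from its last occurrence.
positions = dict(zip(letters, range(len(letters))))

# Numbers of different types mix in arithmetic; the result takes
# the wider of the two types.
as_float = float("3")
as_complex = complex(3, 5)
total = as_complex + as_float
```

Because "tomato" repeats the letters t and o, the resulting dictionary has only four keys, each mapped to the index of that letter's last occurrence.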
