
4.6 The Dynamic Typing Interlude

If you have a background in compiled or statically-typed languages like C, C++, or Java, you might find yourself in a perplexed place at this point. So far, we've been using variables without declaring their types—and it somehow works. When we type a = 3 in an interactive session or program file, how does Python know that a should stand for an integer? For that matter, how does Python know what a even is at all?

Once you start asking such questions, you've crossed over into the domain of Python's dynamic typing model. In Python, types are determined automatically at runtime, not in response to declarations in your code. For you, this means that you never declare variables ahead of time, which is perhaps a simpler concept if you have not programmed in other languages before. Since this is probably the most central concept of the language, though, let's explore it in detail here.

4.6.1 How Assignments Work

You'll notice that when we say a = 3, it works, even though we never told Python to use name a as a variable. In addition, the assignment of 3 to a seems to work too, even though we didn't tell Python that a should stand for an integer type object. In the Python language, this all pans out in a very natural way, as follows:


Creation

A variable, like a, is created when it is first assigned a value by your code. Future assignments change the already-created name to have a new value. Technically, Python detects some names before your code runs; but conceptually, you can think of it as though assignments make variables.


Types

A variable, like a, never has any type information or constraint associated with it. Rather, the notion of type lives with objects, not names. Variables always simply refer to a particular object, at a particular point in time.


Use

When a variable appears in an expression, it is immediately replaced with the object that it currently refers to, whatever that may be. Further, all variables must be explicitly assigned before they can be used; use of unassigned variables results in an error.
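The three rules above can be tried out in a short script. This is only a minimal sketch: the name undefined_name is a hypothetical one chosen because it is never assigned, and the other variable names are illustrative.

```python
a = 3                      # first assignment creates the name a
first_type = type(a)       # the int type belongs to the object 3, not to a

a = 'spam'                 # the same name may later refer to a string object
second_type = type(a)      # now the referenced object is a str

try:
    undefined_name         # hypothetical name that was never assigned
    outcome = 'no error'
except NameError:
    outcome = 'NameError'  # using an unassigned name raises NameError

print(first_type, second_type, outcome)
```

The name a carries no type of its own; type(a) simply reports the type of whatever object a refers to at that moment.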

This model is strikingly different from traditional languages, and is responsible for much of Python's conciseness and flexibility. When you are first starting out, dynamic typing is usually easier to understand if you keep clear the distinction between names and objects. For example, when we say this:

>>> a = 3

At least conceptually, Python will perform three distinct steps to carry out the request, which reflect the operation of all assignments in the Python language:

  1. Create an object to represent the value 3.

  2. Create the variable a, if it does not yet exist.

  3. Link the variable a to the new object 3.

The net result will be a structure inside Python that resembles Figure 4-1. As sketched, variables and objects are stored in different parts of memory, and associated by links—shown as a pointer in the figure. Variables always link to objects (never to other variables), but larger objects may link to other objects.

Figure 4-1. Names and objects, after a = 3

These links from variables to objects are called references in Python—a kind of association.[9] Whenever variables are later used (i.e., referenced), the variable-to-object links are automatically followed by Python. This is all simpler than its terminology may imply. In concrete terms:

[9] Readers with a background in C may find Python references similar to C pointers (memory addresses). In fact, references are implemented as pointers, and often serve the same roles, especially with objects that can be changed in place (more on this later). However, because references are always automatically dereferenced when used, you can never actually do anything useful with a reference itself; this is a feature, which eliminates a vast category of C bugs. But, you can think of Python references as C "void*" pointers, which are automatically followed whenever used.

  • Variables are simply entries in a search table, with space for a link to an object.

  • Objects are just pieces of allocated memory, with enough space to represent the value they stand for, and type tag information.

At least conceptually, each time you generate a new value in your script, Python creates a new object (i.e., a chunk of memory) to represent that value. Python caches and reuses certain kinds of unchangeable objects like small integers and strings as an optimization (each zero is not really a new piece of memory); but it works as though each value is a distinct object. We'll revisit this concept when we meet the == and is comparisons in Section 7.6 in Chapter 7.
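One way to picture the "search table" idea is to run an assignment against an ordinary dictionary. This is a sketch of the concept rather than a description of Python's internals; the dictionary name namespace is an assumption made for illustration.

```python
namespace = {}                 # hypothetical stand-in for a table of names
exec("a = 3", namespace)       # run the assignment a = 3 against that table

has_entry = 'a' in namespace   # the name a is now an entry in the table
obj = namespace['a']           # following the link yields the object itself
obj_type = type(obj)           # the type tag lives with the object, not the name

print(has_entry, obj, obj_type)
```

The table entry holds only a link to the object; the value and its type information are stored with the object.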

Let's extend the session and watch what happens to its names and objects:

>>> a = 3
>>> b = a

After typing these two statements, we generate the scene captured in Figure 4-2. As before, the second line causes Python to create variable b; variable a is being used and not assigned here, so it is replaced with the object it references (3); and b is made to reference that object. The net effect is that variables a and b wind up referencing the same object (that is, pointing to the same chunk of memory). This is called a shared reference in Python—multiple names referencing the same object.

Figure 4-2. Names and objects, after b = a
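If you want to confirm a shared reference for yourself, the is operator and the built-in id function both compare object identity. The sketch below assumes nothing beyond the two assignments already shown; the names same_object and same_id are illustrative.

```python
a = 3
b = a                         # b is linked to the object a references

same_object = a is b          # identity test: both names reach one object
same_id = id(a) == id(b)      # id() returns the same identity for both names

print(same_object, same_id)   # True True
```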

Next, suppose we extend the session with one more statement:

>>> a = 3
>>> b = a
>>> a = 'spam'

As for all Python assignments, this simply makes a new object to represent the string value "spam", and sets a to reference this new object. It does not, however, change the value of b; b still refers to the original object, the integer 3. The resulting reference structure is as in Figure 4-3.

Figure 4-3. Names and objects, after a = 'spam'

The same sort of thing would happen if we changed b to "spam" instead—the assignment would only change b, and not a. This example tends to look especially odd to ex-C programmers—it seems as though the type of a changed from integer to string, by saying a = 'spam'. But not really. In Python, things work more simply: types live with objects, not names. We simply change a to reference a different object.
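The three-statement session can be checked in script form. This is a small sketch that only inspects the names after the fact; a_type and b_type are illustrative names.

```python
a = 3
b = a            # a and b share the integer object 3
a = 'spam'       # a now references a new string object

a_type = type(a)     # the str type comes from the object 'spam'
b_type = type(b)     # b still references the integer 3

print(a, b)              # spam 3
print(a_type, b_type)    # <class 'str'> <class 'int'>
```

Nothing about b was touched by the last assignment; only the link from a was redirected.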

This behavior also occurs if there are no type differences at all. For example, consider these three statements:

>>> a = 3
>>> b = a
>>> a = 5

In this sequence, the same events transpire: Python makes variable a reference the object 3, and makes b reference the same object as a, as in Figure 4-2. As before, the last assignment only sets a to a completely different object, integer 5. It does not change b as a side effect. In fact, there is no way to ever overwrite the value of object 3: integers can never be changed in place, a property called immutability. Unlike variables in some other languages, Python variables are always pointers to objects, not labels of changeable memory areas.
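Immutability also explains what happens with an augmented assignment on an integer. The sketch below uses += as an assumed variation on the session above; because the object 3 cannot be changed in place, the statement builds a new object and rebinds the name.

```python
a = 3
b = a            # shared reference to the integer 3
a += 2           # makes a new integer object 5 and rebinds a to it

print(a, b)      # 5 3
print(a is b)    # False: the names now reference different objects
```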

4.6.2 References and Changeable Objects

As you'll see later in this part's chapters, though, there are objects and operations that perform in-place object changes. For instance, assignment to offsets in lists actually changes the list object itself (in-place), rather than generating a brand new object. For objects that support such in-place changes, you need to be more aware of shared references, since a change from one name may impact others. For instance, list objects support in-place assignment to positions:

>>> L1 = [2,3,4]
>>> L2 = L1

As noted at the start of this chapter, lists are simply collections of other objects, coded in square brackets; L1 here is a list containing objects 2, 3, and 4. Items inside a list are accessed by their positions; L1[0] refers to object 2, the first item in the list L1.

Lists are also objects in their own right, just like integers and strings. After running the two prior assignments, L1 and L2 reference the same object, just like the prior example (see Figure 4-2). Also as before, if we now say this:

>>> L1 = 24

then L1 is simply set to a different object; L2 is still the original list. If instead we change this statement's syntax slightly, however, it has a radically different effect:

>>> L1[0] = 24
>>> L2
[24, 3, 4]

Here, we've changed a component of the object that L1 references, rather than changing L1 itself. This sort of change overwrites part of the list object in-place. The upshot is that the effect shows up in L2 as well, because it shares the same object as L1.

This is usually what you want, but you should be aware of how this works so that it's expected. It's also just the default: if you don't want such behavior, you can request that Python copy objects, instead of making references. We'll explore lists in more depth, and revisit the concept of shared references and copies, in Chapter 6 and Chapter 7.[10]

[10] Objects that can be changed in-place are known as mutables—lists and dictionaries are mutable built-ins, and hence susceptible to in-place change side-effects.
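As a preview of the copying techniques mentioned above, here is a minimal sketch that contrasts a shared reference with two common ways of copying a list: a full slice and the list built-in. The names L3 and L4 are introduced only for illustration, and both copies are top-level copies of the list.

```python
L1 = [2, 3, 4]
L2 = L1          # shared reference: no new list is made
L3 = L1[:]       # top-level copy made by slicing the whole list
L4 = list(L1)    # another top-level copy, made by the list built-in

L1[0] = 24       # in-place change to the object L1 references

print(L2)        # [24, 3, 4]: L2 sees the change
print(L3)        # [2, 3, 4]: the copy is unaffected
print(L4)        # [2, 3, 4]: so is this one
```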

4.6.3 References and Garbage Collection

When names are made to reference new objects, Python also reclaims the old object, if it is not referenced by any other name (or object). This automatic reclamation of objects' space is known as garbage collection. This means that you can use objects liberally, without ever needing to free up space in your script. In practice, it eliminates a substantial amount of bookkeeping code compared to lower-level languages such as C and C++.

To illustrate, consider the following example, which sets name x to a different object on each assignment. First of all, notice how the name x is set to a different type of object each time. It's as though the type of x is changing over time, but it is not: in Python, types live with objects, not names. Because names are just generic references to objects, this sort of code works naturally:

>>> x = 42
>>> x = 'shrubbery'     # Reclaim 42 now (?)
>>> x = 3.1415          # Reclaim 'shrubbery' now (?)
>>> x = [1,2,3]         # Reclaim 3.1415 now (?)

Second of all, notice that references to objects are discarded along the way. Each time x is assigned to a new object, Python reclaims the prior object. For instance, when x is assigned the string 'shrubbery', the object 42 will be immediately reclaimed, as long as it is not referenced anywhere else—the object's space is automatically thrown back into the free space pool, to be reused for a future object.

Technically, this collection behavior may be more conceptual than literal, for certain types. Because Python caches and reuses small integers and some strings as mentioned earlier, the object 42 is probably not literally reclaimed; it remains to be reused the next time you generate a 42 in your code. Most kinds of objects, though, are reclaimed immediately when no longer referenced; for those that are not, the caching mechanism is irrelevant to your code.
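Reclamation is normally invisible, but a weak reference from the standard weakref module lets you observe it without keeping the object alive. The sketch below is an illustration under stated assumptions: Shrubbery is a hypothetical class (used because cached integers and strings would not show the effect), and the gc.collect() call is there only for implementations that do not reclaim objects immediately.

```python
import gc
import weakref

class Shrubbery:            # hypothetical class used only for this demonstration
    pass

x = Shrubbery()
probe = weakref.ref(x)      # observes the object without counting as a reference
alive_before = probe() is not None

x = [1, 2, 3]               # the only reference to the Shrubbery object is dropped
gc.collect()                # assumed unnecessary in standard Python; harmless here
alive_after = probe() is not None

print(alive_before, alive_after)
```

Once x is set to the list, nothing references the Shrubbery object any longer, so calling probe() returns None.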

Of course, you don't really need to draw name/object diagrams with circles and arrows in order to use Python. When you are starting out, though, it sometimes helps you understand some unusual cases, if you can trace their reference structure. Moreover, because everything seems to be assignment and references in Python, a basic understanding of this model helps in many contexts—as we'll see, it works the same in assignment statements, for loop variables, function arguments, module imports, and more.
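Function arguments are one place where the same model shows up. In the sketch below, change_first and rebind are hypothetical functions written for illustration: the first changes the shared list in place, while the second only makes its own local name reference a new object.

```python
def change_first(items):
    items[0] = 'changed'    # in-place change through the shared reference

def rebind(items):
    items = ['new']         # only the local name items is set to a new object

L = [1, 2, 3]
change_first(L)             # the caller's list is changed
rebind(L)                   # the caller's list is left alone

print(L)                    # ['changed', 2, 3]
```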
