[ Team LiB ] |
4.6 The Dynamic Typing Interlude

If you have a background in compiled or statically typed languages like C, C++, or Java, you may be perplexed at this point. So far, we've been using variables without declaring their types, and it somehow works. When we type a = 3 in an interactive session or program file, how does Python know that a should stand for an integer? For that matter, how does Python know what a even is at all?

Once you start asking such questions, you've crossed over into the domain of Python's dynamic typing model. In Python, types are determined automatically at runtime, not in response to declarations in your code. For you, this means that you never declare variables ahead of time, which is perhaps a simpler concept if you have not programmed in other languages before. Since this is probably the most central concept of the language, though, let's explore it in detail here.

4.6.1 How Assignments Work

You'll notice that when we say a = 3, it works, even though we never told Python to use name a as a variable. In addition, the assignment of 3 to a seems to work too, even though we didn't tell Python that a should stand for an integer type object. In the Python language, this all pans out in a very natural way, as follows:
This model is strikingly different from traditional languages, and is responsible for much of Python's conciseness and flexibility. When you are first starting out, dynamic typing is usually easier to understand if you keep clear the distinction between names and objects. For example, when we say this:

    >>> a = 3

at least conceptually, Python will perform three distinct steps to carry out the request, which reflect the operation of all assignments in the Python language:
The net result will be a structure inside Python that resembles Figure 4-1. As sketched, variables and objects are stored in different parts of memory, and associated by links (shown as a pointer in the figure). Variables always link to objects (never to other variables), but larger objects may link to other objects.

Figure 4-1. Names and objects, after a = 3

These links from variables to objects are called references in Python, a kind of association.[9] Whenever variables are later used (i.e., referenced), the variable-to-object links are automatically followed by Python. This is all simpler than its terminology may imply. In concrete terms:
At least conceptually, each time you generate a new value in your script, Python creates a new object (i.e., a chunk of memory) to represent that value. Python caches and reuses certain kinds of unchangeable objects like small integers and strings as an optimization (each zero is not really a new piece of memory); but it works as though each value is a distinct object. We'll revisit this concept when we meet the == and is comparisons in Section 7.6 in Chapter 7.

Let's extend the session and watch what happens to its names and objects:

    >>> a = 3
    >>> b = a

After typing these two statements, we generate the scene captured in Figure 4-2. As before, the second line causes Python to create variable b; variable a is being used and not assigned here, so it is replaced with the object it references (3); and b is made to reference that object. The net effect is that variables a and b wind up referencing the same object (that is, pointing to the same chunk of memory). This is called a shared reference in Python: multiple names referencing the same object.

Figure 4-2. Names and objects, after b = a

Next, suppose we extend the session with one more statement:

    >>> a = 3
    >>> b = a
    >>> a = 'spam'

As for all Python assignments, this simply makes a new object to represent the string value "spam", and sets a to reference this new object. It does not, however, change the value of b; b still refers to the original object, the integer 3. The resulting reference structure is as in Figure 4-3.

Figure 4-3. Names and objects, after a = 'spam'

The same sort of thing would happen if we changed b to "spam" instead: the assignment would only change b, and not a. This example tends to look especially odd to ex-C programmers: it seems as though the type of a changed from integer to string, by saying a = 'spam'. But not really. In Python, things work more simply: types live with objects, not names. We simply change a to reference a different object.
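The session above can be checked directly. The following is a minimal sketch, written as a script rather than an interactive session; the names shared_before and shared_after are introduced here only for illustration. It uses the is comparison to test whether two names reference the same object, and type() to show that the type comes from the object rather than the name.

```python
# Names are references; types belong to the objects they reference.
a = 3
b = a                      # b now references the same object as a
shared_before = a is b     # True: one object, two names

a = 'spam'                 # rebinds a only; b is untouched
shared_after = a is b      # False: a now references a different object

print(shared_before, shared_after)   # True False
print(b, type(b).__name__)           # 3 int
print(a, type(a).__name__)           # spam str
```

Nothing in this script ever changes the object 3; only the link from the name a is moved.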
This behavior also occurs if there are no type differences at all. For example, consider these three statements:

    >>> a = 3
    >>> b = a
    >>> a = 5

In this sequence, the same events transpire: Python makes variable a reference the object 3, and makes b reference the same object as a, as in Figure 4-2. As before, the last assignment only sets a to a completely different object, integer 5. It does not change b as a side effect. In fact, there is no way to ever overwrite the value of object 3 (integers can never be changed in place, a property called immutability). Unlike in some languages, Python variables are always pointers to objects, not labels of changeable memory areas.

4.6.2 References and Changeable Objects

As you'll see later in this part's chapters, though, there are objects and operations that perform in-place object changes. For instance, assignment to offsets in lists actually changes the list object itself (in-place), rather than generating a brand new object. For objects that support such in-place changes, you need to be more aware of shared references, since a change from one name may impact others. For instance, list objects support in-place assignment to positions:

    >>> L1 = [2,3,4]
    >>> L2 = L1

As noted at the start of this chapter, lists are simply collections of other objects, coded in square brackets; L1 here is a list containing objects 2, 3, and 4. Items inside a list are accessed by their positions; L1[0] refers to object 2, the first item in the list L1. Lists are also objects in their own right, just like integers and strings. After running the two prior assignments, L1 and L2 reference the same object, just as in the prior example (see Figure 4-2). Also as before, if we now say this:

    >>> L1 = 24

then L1 is simply set to a different object; L2 is still the original list.
If instead we change this statement's syntax slightly, however, it has a radically different effect:

    >>> L1[0] = 24
    >>> L2
    [24, 3, 4]

Here, we've changed a component of the object that L1 references, rather than changing L1 itself. This sort of change overwrites part of the list object in-place. The upshot is that the effect shows up in L2 as well, because it shares the same object as L1. This is usually what you want, but you should be aware of how this works so that it's expected. It's also just the default: if you don't want such behavior, you can request that Python copy objects, instead of making references. We'll explore lists in more depth, and revisit the concept of shared references and copies, in Chapter 6 and Chapter 7.[10]
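To see the difference between a shared reference and a copy side by side, here is a small sketch. The text above does not say which copying technique it has in mind; the full slice L1[:] used here is one common way to make a top-level copy of a list, and the name L3 is introduced only for this illustration.

```python
L1 = [2, 3, 4]
L2 = L1            # shared reference: both names refer to one list object
L3 = L1[:]         # assumed copy technique: a new list holding the same items

L1[0] = 24         # in-place change to the list that L1 and L2 share

print(L1)                   # [24, 3, 4]
print(L2)                   # [24, 3, 4]  (same object as L1)
print(L3)                   # [2, 3, 4]   (independent copy, unaffected)
print(L1 is L2, L1 is L3)   # True False
```

The change made through L1 is visible through L2 because there is only one list; L3 keeps the old contents because the copy was made before the change.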
4.6.3 References and Garbage Collection

When names are made to reference new objects, Python also reclaims the old object, if it is not referenced by any other name (or object). This automatic reclamation of objects' space is known as garbage collection. This means that you can use objects liberally, without ever needing to free up space in your script. In practice, it eliminates a substantial amount of bookkeeping code compared to lower-level languages such as C and C++.

To illustrate, consider the following example, which sets name x to a different object on each assignment. First of all, notice how the name x is set to a different type of object each time. It's as though the type of x is changing over time; but not really: in Python, types live with objects, not names. Because names are just generic references to objects, this sort of code works naturally:

    >>> x = 42
    >>> x = 'shrubbery'     # Reclaim 42 now (?)
    >>> x = 3.1415          # Reclaim 'shrubbery' now (?)
    >>> x = [1,2,3]         # Reclaim 3.1415 now (?)

Second, notice that references to objects are discarded along the way. Each time x is assigned to a new object, Python reclaims the prior object. For instance, when x is assigned the string 'shrubbery', the object 42 will be immediately reclaimed, as long as it is not referenced anywhere else; the object's space is automatically thrown back into the free space pool, to be reused for a future object.

Technically, this collection behavior may be more conceptual than literal, for certain types. Because Python caches and reuses integers and small strings as mentioned earlier, the object 42 is probably not literally reclaimed; it remains to be reused the next time you generate a 42 in your code. Most kinds of objects, though, are reclaimed immediately when no longer referenced; for those that are not, the caching mechanism is irrelevant to your code.

Of course, you don't really need to draw name/object diagrams with circles and arrows in order to use Python.
When you are starting out, though, it sometimes helps you understand unusual cases if you can trace their reference structure. Moreover, because everything seems to be assignment and references in Python, a basic understanding of this model helps in many contexts. As we'll see, it works the same in assignment statements, for loop variables, function arguments, module imports, and more.
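As a preview of two of those contexts, the sketch below applies the same reference model to function arguments and for loop variables. The function names change_first and rebind, and the sample data, are made up for this illustration.

```python
def change_first(items):
    # items is another name for the caller's list object, not a copy,
    # so this in-place change is visible to the caller
    items[0] = 'changed'

def rebind(items):
    # assignment only rebinds the local name items;
    # the caller's object is left untouched
    items = ['new', 'list']
    return items

data = [1, 2, 3]
change_first(data)
print(data)          # ['changed', 2, 3]

rebind(data)
print(data)          # ['changed', 2, 3]

rows = [[0], [0]]
for row in rows:
    row[0] = 1       # row references each nested list in turn
print(rows)          # [[1], [1]]
```

In each case, passing an argument or stepping a loop variable behaves like an assignment: a name is linked to an existing object, and only in-place changes to that object are shared.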