
7.2 Files

Most readers are probably familiar with the notion of files: named storage compartments on your computer that are managed by your operating system. This last built-in object type provides a way to access those files inside Python programs. The built-in open function creates a Python file object, which serves as a link to a file residing on your machine. After calling open, you can read and write the associated external file by calling the file object's methods. In Python 2.2 and later 2.x releases, the built-in name file is a synonym for open, and files may be opened by calling either name; Python 3 removed the name file, so open is the portable spelling.

Compared to the types you've seen so far, file objects are somewhat unusual. They're not numbers, sequences, or mappings; instead, they export methods only for common file processing tasks.

Table 7-2 summarizes common file operations. To open a file, a program calls the open function with the external filename first, followed by a processing mode: 'r' to open for input (the default), 'w' to create and open for output (erasing any existing content), 'a' to open for appending to the end, and others we'll omit here. Both arguments must be Python strings. The filename may include a platform-specific directory path prefix, either absolute or relative; without a path, the file is assumed to exist in the current working directory (i.e., where the script runs).
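As a rough illustration of the three modes just described, the following sketch writes, appends to, and then reads back a small file. The scratch directory and the names workdir, path, out, and inp are assumptions made for this example only; they do not appear elsewhere in this chapter.

```python
import os
import tempfile

# Hypothetical scratch location; any writable path would do.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'modes.txt')

out = open(path, 'w')      # 'w' creates the file (or empties an existing one)
out.write('first\n')
out.close()

out = open(path, 'a')      # 'a' keeps existing content and adds to the end
out.write('second\n')
out.close()

inp = open(path)           # mode omitted, so it defaults to 'r' (input)
text = inp.read()
inp.close()

print(repr(text))          # 'first\nsecond\n'
```

Reopening the same file with 'w' instead of 'a' in the second step would have discarded the first line.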

Table 7-2. Common file operations

Operation                          Interpretation
output = open('/tmp/spam', 'w')    Create output file ('w' means write).
input = open('data', 'r')          Create input file ('r' means read).
S = input.read()                   Read entire file into a single string.
S = input.read(N)                  Read N bytes (1 or more).
S = input.readline()               Read next line (through end-line marker).
L = input.readlines()              Read entire file into list of line strings.
output.write(S)                    Write string S into file.
output.writelines(L)               Write all line strings in list L into file.
output.close()                     Manual close (done for you when file collected).

Once you have a file object, call its methods to read from or write to the external file. In all cases, file text takes the form of strings in Python programs; reading a file returns its text in strings, and text is passed to the write methods as strings. Reading and writing both come in multiple flavors; Table 7-2 gives the most common.
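To make the different reading and writing flavors concrete, here is a minimal sketch that exercises most of the calls in Table 7-2 on a small scratch file. The temporary path and the variable names (infile, first_two, and so on) are hypothetical; infile is used rather than input only to avoid hiding the built-in function of that name.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.txt')  # hypothetical scratch file

output = open(path, 'w')
output.write('spam\n')                    # write one string
output.writelines(['eggs\n', 'ham\n'])    # write each string in the list as-is
output.close()

infile = open(path, 'r')
first_two = infile.read(2)        # up to 2 characters: 'sp'
rest_of_line = infile.readline()  # remainder of the current line: 'am\n'
remaining = infile.readlines()    # the lines still unread, as a list
infile.close()

infile = open(path, 'r')
whole = infile.read()             # the entire file as one string
infile.close()

print(repr(whole))                # 'spam\neggs\nham\n'
```

Note that writelines does not add end-of-line characters for you; each string in the list must already end with '\n' if you want separate lines.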

Calling the file close method terminates your connection to the external file. In Python, an object's memory space is automatically reclaimed as soon as the object is no longer referenced anywhere in the program. When file objects are reclaimed, Python also automatically closes the file if needed. Because of that, you don't always need to close your files manually, especially in simple scripts that don't run long. Strictly speaking, though, this auto-close-on-collection feature of files is not part of the language definition, and may change over time. Manual close calls can't hurt, are usually a good idea in larger systems, and are a good habit to form.
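If you would rather not depend on automatic closing, one common pattern is to pair open with a try/finally block so that close runs even when a write fails. The sketch below shows that pattern, followed by the with statement, which is available in Python 2.6 and later and closes the file when its block exits. The scratch path and variable names are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'log.txt')  # hypothetical scratch file

# Explicit close, guaranteed by try/finally even if the write raises.
output = open(path, 'w')
try:
    output.write('explicitly closed\n')
finally:
    output.close()

closed_after_finally = output.closed   # True once close has run

# Python 2.6 and later: the with statement closes the file on block exit.
with open(path) as infile:
    line = infile.readline()

closed_after_with = infile.closed      # True after the with block
```

Either form makes the point at which the file is closed visible in the code, instead of leaving it to whenever the object happens to be reclaimed.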

7.2.1 Files in Action

Here is a simple example that demonstrates file-processing basics. It first opens a new file for output, writes a string (terminated with a newline marker, '\n'), and closes the file. Later, the example opens the same file again in input mode, and reads the line back. Notice that the second readline call returns an empty string; this is how Python file methods tell you that you've reached the end of the file (empty lines in the file come back as strings with just a newline character, not empty strings).

>>> myfile = open('myfile', 'w')             # Open for output (creates).
>>> myfile.write('hello text file\n')        # Write a line of text.
>>> myfile.close()

>>> myfile = open('myfile', 'r')             # Open for input.
>>> myfile.readline()                        # Read the line back.
'hello text file\n'
>>> myfile.readline()                        # Empty string: end of file.
''
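The end-of-file convention shown in this session can drive a simple reading loop. The following sketch, which uses a hypothetical scratch file containing a blank line, keeps calling readline until it returns an empty string; it is one possible way to write such a loop, not the only one.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'lines.txt')  # hypothetical scratch file

output = open(path, 'w')
output.write('one\n\nthree\n')   # the middle line is blank
output.close()

lines = []
infile = open(path, 'r')
while True:
    line = infile.readline()
    if line == '':               # empty string: end of file
        break
    lines.append(line)           # a blank line arrives as '\n', not ''
infile.close()

print(lines)                     # ['one\n', '\n', 'three\n']
```

Because the blank line comes back as '\n', the loop does not stop early in the middle of the file.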

There are additional, more advanced file methods not shown in Table 7-2; for instance, seek resets your current position in a file (the next read or write happens at that position), flush forces buffered output to be written out to disk without closing the file (by default, file output is buffered), and so on.
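As a small, hedged illustration of seek and flush, the sketch below flushes a pending write so that a second file object can read it while the first is still open, and then uses seek(0) to go back to the start and read the line again. The scratch path and the names reader, first, at_end, and again are assumptions for this example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'seek.txt')  # hypothetical scratch file

output = open(path, 'w')
output.write('hello text file\n')
output.flush()                   # hand buffered output to the operating system

# The flushed data can be read through a second file object
# even though output has not been closed yet.
reader = open(path, 'r')
first = reader.readline()        # 'hello text file\n'
at_end = reader.readline()       # '' because we are at end of file
reader.seek(0)                   # move back to the start of the file
again = reader.readline()        # the same line, read a second time
reader.close()
output.close()
```

Without the flush call, the text might still be sitting in the output buffer when the reader opens the file, and the first readline could come back empty.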

The sidebar Why You Will Care: File Scanners in Chapter 10 sketches common file-scanning loop code patterns, and the examples in later parts of this book discuss larger file-based code. In addition, the Python standard library manual and the reference books described in the Preface provide a complete list of file methods.
