
17.1 Package Import Basics

Here's how package imports work. In the place where we have been naming a simple file in import statements, we can instead list a path of names separated by periods:

import dir1.dir2.mod

The same goes for from statements:

from dir1.dir2.mod import x

The "dotted" path in these statements is assumed to correspond to a path through the directory hierarchy on your machine, leading to the file mod.py (or other file type). That is, there is directory dir1, which has a subdirectory dir2, which contains a module file mod.py (or other suffix).

Furthermore, these imports imply that dir1 resides within some container directory dir0, which is accessible on the Python module search path. In other words, the two import statements imply a directory structure that looks something like this (shown with DOS backslash separators):

dir0\dir1\dir2\mod.py             # Or mod.pyc,mod.so,...

The container directory dir0 still needs to be added to your module search path (unless it's the home directory of the top-level file), exactly as if dir1 were a module file. From there down, the import statements in your script give the directory path leading to the module explicitly.
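As a rough illustration of the layout just described, the following sketch builds dir0\dir1\dir2\mod.py in a temporary location and runs both import forms against it. The temporary directory standing in for dir0 and the variable x assigned in mod.py are assumptions made for the demonstration. The sketch also creates the __init__.py marker files covered later in this section.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A throwaway directory plays the role of the container, dir0 (assumption).
dir0 = Path(tempfile.mkdtemp())
dir2 = dir0 / "dir1" / "dir2"
dir2.mkdir(parents=True)

# Package marker files; see "Package __init__.py Files" later in this section.
(dir0 / "dir1" / "__init__.py").write_text("")
(dir2 / "__init__.py").write_text("")

# The module at the end of the dotted path; x is a hypothetical variable.
(dir2 / "mod.py").write_text("x = 99\n")

# Only the container goes on the module search path, not dir1 or dir2.
sys.path.insert(0, str(dir0))
importlib.invalidate_caches()

import dir1.dir2.mod             # binds the name dir1 in this namespace
from dir1.dir2.mod import x      # copies x out of the nested module

print(dir1.dir2.mod.x, x)        # → 99 99
```

Here sys.path is extended at run time only to keep the sketch self-contained; setting PYTHONPATH or using a .pth file has the same effect on the search.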

17.1.1 Packages and Search Path Settings

If you use this feature, keep in mind that the directory paths in your import statements can only be variable names separated by periods. You cannot use any platform-specific path syntax in your import statements; things like C:\dir1, My Documents.dir2, and ../dir1 do not work syntactically. Instead, use platform-specific syntax in your module search path settings to name the container directory.

For instance, in the prior example, dir0—the directory name you add to your module search path—can be an arbitrarily long and platform-specific directory path leading up to dir1. Instead of using an invalid statement like this:

import C:\mycode\dir1\dir2\mod      # Error: illegal syntax

add C:\mycode to your PYTHONPATH variable or .pth files, unless it is the program's home directory, and say this:

import dir1.dir2.mod
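The syntactic difference between these statements can be checked without touching the filesystem. The sketch below only parses each statement with the standard ast module; nothing is imported. The helper name is_valid_import is hypothetical.

```python
import ast


def is_valid_import(statement):
    """Return True if the statement parses as Python source (no import is run)."""
    try:
        ast.parse(statement)
    except SyntaxError:
        return False
    return True


candidates = [
    r"import C:\mycode\dir1\dir2\mod",   # platform-specific path
    "import ../dir1",                    # relative filesystem path
    "import My Documents.dir2",          # name containing a space
    "import dir1.dir2.mod",              # dotted names only
]

# Map each statement to whether the parser accepts it.
results = {stmt: is_valid_import(stmt) for stmt in candidates}

for stmt, ok in results.items():
    print(f"{stmt!r}: {'parses' if ok else 'SyntaxError'}")
```

Only the last statement parses; the others are rejected before Python ever looks at the module search path.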

In effect, entries on the module search path provide platform-specific directory path prefixes, which lead to the leftmost names in import statements. Import statements provide directory path tails in a platform-neutral fashion.[1]

[1] The dot path syntax was chosen partly for platform neutrality, but also because paths in import statements become real nested object paths. This syntax also means that you get odd error messages if you forget to omit the .py in your import statements: import mod.py is assumed to be a directory path import. It loads mod.py, then tries to load a mod\py.py, and ultimately issues a potentially confusing error message.
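The mistake described in the footnote can be reproduced with a short sketch. The module name footnote_mod and the temporary directory are assumptions made for the demonstration, and the exact wording of the error message varies between Python versions, so the code captures it rather than predicting it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A throwaway directory on the search path holding one simple module.
home = Path(tempfile.mkdtemp())
(home / "footnote_mod.py").write_text("x = 1\n")   # hypothetical module
sys.path.insert(0, str(home))
importlib.invalidate_caches()

try:
    # Mistake: ".py" is read as a submodule named py inside footnote_mod.
    import footnote_mod.py
except ImportError as exc:
    message = str(exc)
else:
    message = None

# The leftmost name was loaded before the lookup of "py" failed.
loaded_first = "footnote_mod" in sys.modules

print(message)
print(loaded_first)
```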

17.1.2 Package __init__.py Files

If you choose to use package imports, there is one more constraint you must follow. Each directory named within the path of a package import statement must also contain a file named __init__.py, or else your package imports will fail. In the example we've been using, both dir1 and dir2 must contain a file called __init__.py; the container directory dir0 does not require such a file, because it's not listed in the import statement itself. More formally, for a directory structure such as:

dir0\dir1\dir2\mod.py

and an import statement of the form:

import dir1.dir2.mod

the following rules apply:

  • dir1 and dir2 both must contain an __init__.py file.

  • dir0, the container, does not require an __init__.py; it will simply be ignored if present.

  • dir0 must be listed on the module search path (home directory, PYTHONPATH, etc.), not dir0\dir1.

The net effect is that this example's directory structure should be as follows, with indentation designating directory nesting:

dir0\                       # Container on module search path
    dir1\
        __init__.py
        dir2\
            __init__.py
            mod.py

These __init__.py files contain Python code, just like normal module files. They are partly present as a declaration to Python, and can be completely empty. As a declaration, these files serve to prevent directories with a common name from unintentionally hiding true modules that occur later on the module search path. Otherwise, Python may pick a directory that has nothing to do with your code, just because it appears in an earlier directory on the search path.
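The declaration role can be seen in a small experiment. In this sketch, the name shadowdemo and both temporary directories are hypothetical. An unrelated directory without an __init__.py appears first on the search path, and a real module file with the same name appears later. The result shown assumes Python 3.3 or later, where a directory without an __init__.py is at most a namespace-package candidate and does not take priority over a real module.

```python
import importlib
import sys
import tempfile
from pathlib import Path

early = Path(tempfile.mkdtemp())   # will come first on the search path
late = Path(tempfile.mkdtemp())    # will come later on the search path

# An unrelated directory that merely shares the module's name; no __init__.py.
(early / "shadowdemo").mkdir()

# The true module, located in a later search-path entry.
(late / "shadowdemo.py").write_text("origin = 'real module'\n")

sys.path[:0] = [str(early), str(late)]
importlib.invalidate_caches()

import shadowdemo

# The undeclared directory did not hide the module file.
print(shadowdemo.origin)
```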

More generally, this file serves as a hook for package initialization-time actions, serves to generate a module namespace for a directory, and implements the behavior of from* (i.e., from ... import *) statements when used with directory imports:


Package initialization

The first time Python imports through a directory, it automatically runs all the code in the directory's __init__.py file. Because of that, these files are a natural place to put code to initialize the state required by files in the package. For instance, a package might use its initialization file to create required data files, open connections to databases, and so on. Typically, __init__.py files are not meant to be useful if executed directly; they are run automatically during imports, the first time Python goes through a directory.
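A minimal sketch of this run-once behavior follows. The package names initdemo and sub are hypothetical stand-ins for dir1 and dir2. Each __init__.py appends its own module name to a log file when it executes, so the log records which initialization files ran and in what order.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())            # container directory (assumption)
log = root / "init.log"
inner = root / "initdemo" / "sub"
inner.mkdir(parents=True)

# Initialization code shared by both packages: record that this file ran.
init_code = (
    f"with open({str(log)!r}, 'a') as f:\n"
    "    f.write(__name__ + '\\n')\n"
)
(root / "initdemo" / "__init__.py").write_text(init_code)
(inner / "__init__.py").write_text(init_code)
(inner / "mod.py").write_text("x = 1\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import initdemo.sub.mod          # first import: both __init__.py files run
import initdemo.sub.mod          # already loaded: nothing runs again
from initdemo.sub import mod     # likewise, no rerun

ran = log.read_text().split()
print(ran)                       # → ['initdemo', 'initdemo.sub']
```

The outer package initializes before the inner one, and repeated imports leave the log unchanged.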


Module namespace initialization

In the package import model, the directory paths in your script become real nested object paths after the import. For instance, in the example above, the expression dir1.dir2 works, and returns a module object whose namespace contains all the names assigned by dir2's __init__.py file. Such files provide a namespace for the module objects created for directories, which have no real module file of their own.
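To make the nested object path concrete, here is a small sketch. The names nsdemo, sub, y, and z are hypothetical; nsdemo and sub play the roles of dir1 and dir2. The inner __init__.py assigns one name, which then shows up as an attribute of the package's module object.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

root = Path(tempfile.mkdtemp())            # container directory (assumption)
inner = root / "nsdemo" / "sub"
inner.mkdir(parents=True)

(root / "nsdemo" / "__init__.py").write_text("")
(inner / "__init__.py").write_text("y = 2\n")   # name assigned by the package
(inner / "mod.py").write_text("z = 3\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import nsdemo.sub.mod

pkg = nsdemo.sub                           # the directory, as an object
print(isinstance(pkg, types.ModuleType))   # → True
print(pkg.y)                               # from sub/__init__.py → 2
print(pkg.mod.z)                           # from sub/mod.py → 3
```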


From* statement behavior

As an advanced feature, you can use __all__ lists in __init__.py files to define what is exported when a directory is imported with the from* statement form. (We'll meet __all__ in Chapter 18.) In an __init__.py file, the __all__ list is taken to be the list of submodule names that should be imported when from* is used on the package (directory) name. If __all__ is not set, the from* does not automatically load submodules nested in the directory, but instead loads just names defined by assignments in the directory's __init__.py file, including any submodules explicitly imported by code in this file. For instance, a statement from submodule import X in a directory's __init__.py makes name X available in that directory's namespace.
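The two cases can be compared side by side in a short sketch. The package names alldemo and plaindemo, the submodules mod_a and mod_b, and the helper public are all hypothetical. The second package's __init__.py uses the package-qualified form from plaindemo.mod_a import X, because Python 3 does not resolve a bare submodule name relative to the package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())            # container directory (assumption)

# Package with __all__: listed submodule names are loaded by from *.
with_all = root / "alldemo"
with_all.mkdir()
(with_all / "__init__.py").write_text("__all__ = ['mod_a']\n")
(with_all / "mod_a.py").write_text("X = 1\n")
(with_all / "mod_b.py").write_text("Y = 2\n")

# Package without __all__: only names assigned in __init__.py are copied.
without_all = root / "plaindemo"
without_all.mkdir()
(without_all / "__init__.py").write_text(
    "from plaindemo.mod_a import X\nZ = 3\n"
)
(without_all / "mod_a.py").write_text("X = 1\n")
(without_all / "mod_b.py").write_text("Y = 2\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run each star import into a fresh namespace so the results can be inspected.
ns_all, ns_plain = {}, {}
exec("from alldemo import *", ns_all)
exec("from plaindemo import *", ns_plain)


def public(ns):
    """Names copied by the import, ignoring entries such as __builtins__."""
    return sorted(name for name in ns if not name.startswith("__"))


print(public(ns_all))      # → ['mod_a']
print(public(ns_plain))    # → ['X', 'Z', 'mod_a']
```

In neither case is mod_b loaded: it is not listed in __all__ in the first package, and the second package's __init__.py never imports it.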

You can also simply leave these files empty, if their roles are beyond your needs. They must really exist, though, for your directory imports to work at all.
