Learning Python 2nd Edition

B.8 Part VIII, The Outer Layers

B.8.1 Chapter 27, Common Tasks in Python

See Section 27.9 for the exercises.

Avoiding regular expressions. This program is long and tedious, but not especially complicated. See if you can understand how it works. Whether this is easier for you than regular expressions depends on many factors, such as your familiarity with regular expressions and your comfort with string methods. Use whichever type of programming works for you.

file = open('pepper.txt')
text = file.read(  )
paragraphs = text.split('\n\n')

def find_indices_for(big, small):
    indices = [  ]
    cum = 0
    while 1:
        index = big.find(small)
        if index == -1:
            return indices
        indices.append(index+cum)
        big = big[index+len(small):]
        cum = cum + index + len(small)

def fix_paragraphs_with_word(paragraphs, word):
    lenword = len(word)
    for par_no in range(len(paragraphs)):
        p = paragraphs[par_no]
        wordpositions = find_indices_for(p, word)
        # Work from right to left, so that a replacement of a different
        # length never shifts the positions still to be examined.
        wordpositions.reverse(  )
        for start in wordpositions:
            # Look for 'pepper' ahead of this occurrence of the word.
            indexpepper = p.find('pepper', start+lenword)
            if indexpepper == -1: continue
            if p[start+lenword:indexpepper].strip(  ):
                # Something other than whitespace in between!
                continue
            where = indexpepper+len('pepper')
            if p[where:where+len('corn')] == 'corn':
                # It's immediately followed by 'corn'!
                continue
            if p.find('salad', where) == -1:
                # It's not followed by 'salad'.
                continue
            # Finally! We get to do a change!
            p = p[:start] + 'bell' + p[start+lenword:]
            paragraphs[par_no] = p         # Change mutable argument!

fix_paragraphs_with_word(paragraphs, 'red')
fix_paragraphs_with_word(paragraphs, 'green')

for paragraph in paragraphs:
    print paragraph+'\n'

We won't repeat the output here; it's the same as that of the regular expression solution.
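
The slicing in find_indices_for can be avoided, because the find method of strings accepts an optional start position. The following is a minimal sketch of that variant, not part of the exercise; the name find_all is our own, and like the original it reports non-overlapping matches only. The guard against an empty search string is an addition, since an empty string would otherwise match forever.

```python
def find_all(big, small):
    """Return the start index of each non-overlapping occurrence of small in big."""
    if not small:
        raise ValueError('small must be a non-empty string')
    indices = []
    start = 0
    while True:
        index = big.find(small, start)   # search from position 'start' onward
        if index == -1:
            return indices
        indices.append(index)
        start = index + len(small)       # resume just past this match

positions = find_all('red pepper, green pepper, red corn', 'red')
# positions is [0, 26]
```

Because the indices refer to the original string throughout, no running offset such as cum is needed.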

Wrapping a text file with a class. This one is surprisingly easy, if you understand classes and the split method of strings. The following is a version that has one little twist over and beyond what we asked for:

class FileStrings:
    def __init__(self, filename=None, data=None):
        if data is None:
            self.data = open(filename).read(  )
        else:
            self.data = data
        self.paragraphs = self.data.split('\n\n')
        self.lines = self.data.split('\n')
        self.words = self.data.split(  )
    def __repr__(self):
        return self.data
    def paragraph(self, index):
        return FileStrings(data=self.paragraphs[index])
    def line(self, index):
        return FileStrings(data=self.lines[index])
    def word(self, index):
        return self.words[index]

This solution, when applied to the file pepper.txt, gives:

>>> from FileStrings import FileStrings
>>> bigtext = FileStrings('pepper.txt')
>>> print bigtext.paragraph(0)
This is a paragraph that mentions bell peppers multiple times.  For
one, here is a red Pepper and dried tomato salad recipe.  I don't like
to use green peppers in my salads as much because they have a harsher
flavor.
>>> print bigtext.line(0)
This is a paragraph that mentions bell peppers multiple times.  For
>>> print bigtext.line(-4)
aren't peppers, they're chilies, but would you rather have a good cook
>>> print bigtext.word(-4)
botanist

How does it work? The constructor simply reads the whole file into a big string (the instance attribute data) and then splits it according to the various criteria, keeping the results of the splits in instance attributes that are lists of strings. When returning from the paragraph and line accessor methods, the data is itself wrapped in a FileStrings object (word returns a plain string). This isn't required by the assignment, but it's nice because it means you can chain the operations, so that to find out what the last word of the third line of the third paragraph is, you can just write:

>>> print bigtext.paragraph(2).line(2).word(-1)
cook
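
The chaining can be tried without pepper.txt by wrapping a string directly. The class below is a stripped-down stand-in for FileStrings, written only for this illustration (the name Text and the sample string are made up); it keeps just the part that makes chaining work.

```python
class Text:
    """Stripped-down stand-in for FileStrings that wraps a string only."""
    def __init__(self, data):
        self.data = data
        self.paragraphs = data.split('\n\n')
        self.lines = data.split('\n')
        self.words = data.split()
    def paragraph(self, index):
        return Text(self.paragraphs[index])   # wrapped, so calls can chain
    def line(self, index):
        return Text(self.lines[index])        # wrapped again
    def word(self, index):
        return self.words[index]              # a plain string ends the chain

sample = Text('one two\nthree four\n\nfive six\nseven eight')
last = sample.paragraph(1).line(1).word(-1)
# last is 'eight'
```

Each call to paragraph or line builds a new object whose lists are computed from the narrower string, which is why the indexes in a chain are relative to the piece selected so far.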

Describing a directory. There are several solutions to this exercise, naturally. One simple solution is:

import os, sys, stat

def describedir(start):
    def describedir_helper(arg, dirname, files):
        """ Helper function for describing directories """
        print "Directory %s has files:" % dirname
        for file in files:
            # Find the full path to the file (directory + filename).
            fullname = os.path.join(dirname, file)
            if os.path.isdir(fullname):
                # If it's a directory, say so; no need to find the size.
                print '  '+ file + ' (subdir)' 
            else: 
                # Find out the size and print the info.
                size = os.stat(fullname)[stat.ST_SIZE]
                print '  '+file+' size='  + `size`

    # Start the 'walk'.
    os.path.walk(start, describedir_helper, None)

which uses the walk function in the os.path module, and works just fine:

>>> import describedir
>>> describedir.describedir('testdir')
Directory testdir has files:
  describedir.py size=939
  subdir1 (subdir)
  subdir2 (subdir)
Directory testdir\subdir1 has files:
  makezeros.py size=125
  subdir3 (subdir)
Directory testdir\subdir1\subdir3 has files:
Directory testdir\subdir2 has files:

Note that you could have found the size of the files by doing len(open(fullname, 'rb').read( )), but this works only when you have read access to all the files and is quite inefficient. The stat call in the os module gives out all kinds of useful information in a tuple, and the stat module defines some names that make it unnecessary to remember the order of the elements in that tuple. See the Library Reference for details.
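
Another of the possible solutions uses os.walk, which was added in Python 2.3 and is the only one of the two walkers that still exists in Python 3 (os.path.walk was removed there). The sketch below is one way to write it, under the assumption that returning the report as a list of lines is acceptable; the name describe_tree is ours, and the sorting is added only to make the order predictable.

```python
import os

def describe_tree(start):
    """Return a list of report lines for every directory at or below start."""
    report = []
    for dirname, subdirs, files in os.walk(start):
        report.append('Directory %s has files:' % dirname)
        for name in sorted(files):
            fullname = os.path.join(dirname, name)
            # os.path.getsize asks the filesystem for the size, as os.stat does,
            # so the file itself is never opened or read.
            report.append('  %s size=%d' % (name, os.path.getsize(fullname)))
        for name in sorted(subdirs):
            report.append('  %s (subdir)' % name)
    return report
```

Calling '\n'.join(describe_tree('testdir')) and printing the result gives a listing along the lines of the one above, with files listed before subdirectories.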

Modifying the prompt. The key to this exercise is to remember that the ps1 and ps2 attributes of the sys module can be anything, including a class instance with a __repr__ or __str__ method. For example:

import sys, os
class MyPrompt:
    def __init__(self, subprompt='>>> '):
        self.lineno = 0
        self.subprompt = subprompt
    def __repr__(self):
        self.lineno = self.lineno + 1
        return os.getcwd(  )+'|%d'%(self.lineno)+self.subprompt

sys.ps1 = MyPrompt(  )
sys.ps2 = MyPrompt('... ')

This code works as shown (use the -i option of the Python interpreter to make sure your program starts right away):

h:\David\book> python -i modifyprompt.py
h:\David\book|1>>> x = 3
h:\David\book|2>>> y = 3
h:\David\book|3>>> def foo(  ):
h:\David\book|3...   x = 3                # The secondary prompt is supported.
h:\David\book|3...
h:\David\book|4>>> import os
h:\David\book|5>>> os.chdir('..')
h:\David|6>>>                             # Note that the prompt changed!
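
The counter advances because the interpreter converts sys.ps1 to a string each time it is about to display the prompt. That conversion can be done by hand to see the effect without an interactive session. The class CountingPrompt below is a cut-down, hypothetical variant of MyPrompt that leaves out the current directory so that the result is predictable.

```python
class CountingPrompt:
    def __init__(self, subprompt='>>> '):
        self.lineno = 0
        self.subprompt = subprompt
    def __str__(self):
        # Runs once per conversion to a string, so each call counts one line.
        self.lineno = self.lineno + 1
        return '%d%s' % (self.lineno, self.subprompt)

prompt = CountingPrompt()
first = str(prompt)     # '1>>> '
second = str(prompt)    # '2>>> '
```

Note that every instance keeps its own count, so two separate instances assigned to sys.ps1 and sys.ps2 are numbered independently.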

Writing a simple shell. The following script, which implements a few of the Unix commands, should be mostly self-explanatory. Note that we've only put a "help" message for the ls command, but there should be one for all the other commands as well:

import cmd, os, sys, shutil

class UnixShell(cmd.Cmd):
    def do_EOF(self, line):
        """ The do_EOF command is called when the user presses Ctrl-D (unix)
            or Ctrl-Z (PC). """
        sys.exit(  )

    def help_ls(self):
        print "ls <directory>: list the contents of the specified directory"
        print "                (current directory used by default)"
        
    def do_ls(self, line):
        # 'ls' by itself means 'list current directory'
        if line == '': dirs = [os.curdir]
        else: dirs = line.split(  )
        for dirname in dirs:
            print 'Listing of %s:' % dirname
            print '\n'.join(os.listdir(dirname))

    def do_cd(self, dirname):
        # 'cd' by itself means 'go home'.
        if dirname == '': dirname = os.environ['HOME']
        os.chdir(dirname)

    def do_mkdir(self, dirname):
        os.mkdir(dirname)

    def do_cp(self, line):
        words = line.split(  )
        sourcefiles,target = words[:-1], words[-1] # target could be a dir
        for sourcefile in sourcefiles:
            shutil.copy(sourcefile, target)

    def do_mv(self, line):
        source, target = line.split(  )
        os.rename(source, target)

    def do_rm(self, line):
        for arg in line.split(  ):
            os.remove(arg)

class DirectoryPrompt:
    def __repr__(self):
        return os.getcwd(  )+'> '

cmd.PROMPT = DirectoryPrompt(  )
shell = UnixShell(  )
shell.cmdloop(  )

Note that we've reused the trick from the "Modifying the prompt" exercise above to have a prompt that adjusts with the current directory, combined with the trick of modifying the attribute PROMPT in the cmd module itself. Of course, those weren't part of the assignment, but it's hard to limit oneself to a simple thing when a full-featured one will do. It works, too!

h:\David\book> python -i shell.py
h:\David\book> cd ../tmp
h:\David\tmp> ls
Listing of .:
api
ERREUR.DOC
ext
giant_~1.jpg
icons
index.html
lib
pythlp.hhc
pythlp.hhk
ref
tut
h:\David\tmp> cd ..
h:\David> cd tmp
h:\David\tmp> cp index.html backup.html
h:\David\tmp> rm backup.html
h:\David\tmp> ^Z

Of course, to be truly useful, this script needs a lot of error checking and many more features, all of which is left, as math textbooks say, as an exercise for the reader.
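
As a starting point for that exercise, here is a minimal sketch of what error checking might look like for a single command. It is an illustration, not the script above: the class name SaferShell is made up, the choice to fall back to the current directory when HOME is unset is our assumption, and the code is written to run on current versions of Python. It relies on two documented features of cmd.Cmd: the prompt attribute, and the postcmd hook that cmdloop calls after each command.

```python
import cmd, os

class SaferShell(cmd.Cmd):
    """Hypothetical variant of UnixShell with basic error reporting."""
    def do_cd(self, dirname):
        # 'cd' by itself means 'go home'; use the current directory if
        # HOME is not set (an assumption made for this sketch).
        if dirname == '':
            dirname = os.environ.get('HOME', os.curdir)
        try:
            os.chdir(dirname)
        except OSError as err:
            # Report the problem and keep the shell running.
            self.stdout.write('cd: %s\n' % err)

    def do_EOF(self, line):
        return True            # A true result tells cmdloop to stop.

    def postcmd(self, stop, line):
        # Refresh the prompt after each command so it tracks the directory.
        self.prompt = os.getcwd() + '> '
        return stop
```

Running SaferShell().cmdloop() starts the loop. The same try/except pattern would apply to the other commands, each of which can fail with an OSError when a file or directory is missing.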

Redirecting stdout. This is simple: all you have to do is to replace the first line with:

import fileinput, sys                       # No change here
sys.stdout = open(sys.argv[-1], 'w')        # Open the output file.
del sys.argv[-1]                            # We've dealt with this argument.
...                                         # Continue as before.
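
The replacement lines leave sys.stdout pointing at the file for the rest of the run, which is fine for a short script. Where the redirection should be temporary, the original stream can be saved and put back. The helper below is a sketch of that idea; run_redirected is a hypothetical name, not something the exercise asks for.

```python
import sys

def run_redirected(func, outname):
    """Call func() with sys.stdout pointing at the file named outname."""
    saved = sys.stdout
    sys.stdout = open(outname, 'w')
    try:
        func()
    finally:
        sys.stdout.close()
        sys.stdout = saved          # always put the original stream back
```

The try/finally ensures the original stream is restored even if func raises an exception.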

B.8.2 Chapter 28, Frameworks

See Section 28.5 for the exercises.

Faking the Web. What you need to do is to create instances of a class that has the fieldnames attribute and appropriate instance variables. One possible solution is:

class FormData:
    def __init__(self, dict):
        for k, v in dict.items(  ):
            setattr(self, k, v)
class FeedbackData(FormData):
    """ A FormData generated by the comment.html form. """
    fieldnames = ('name', 'address', 'email', 'type', 'text')
    def __repr__(self):
        return "%(type)s from %(name)s on %(time)s" % vars(self)

fake_entries = [
    {'name': "John Doe",
     'address': '500 Main St., SF CA 94133',
     'email': 'john@sf.org',
     'type': 'comment',
     'text': 'Great toothpaste!'},
    {'name': "Suzy Doe",
     'address': '500 Main St., SF CA 94133',
     'email': 'suzy@sf.org',
     'type': 'complaint',
     'text': "It doesn't taste good when I kiss John!"},
    ]

DIRECTORY = r'C:\complaintdir'
if __name__ == '__main__':
    import tempfile, pickle, time
    tempfile.tempdir = DIRECTORY
    for fake_entry in fake_entries:
        data = FeedbackData(fake_entry)
        filename = tempfile.mktemp(  )
        data.time = time.asctime(time.localtime(time.time(  )))
        pickle.dump(data, open(filename, 'w'))

As you can see, the only thing you really had to change was the way the constructor for FormData works, since it has to do the setting of attributes from a dictionary as opposed to a FieldStorage object.
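
The __repr__ method depends on vars(self), which returns the dictionary of instance attributes that setattr filled in. A short sketch makes the connection visible; the parameter name fields and the timestamp are made up for the illustration.

```python
class FormData:
    def __init__(self, fields):
        # Turn each dictionary key into an instance attribute.
        for k, v in fields.items():
            setattr(self, k, v)

entry = FormData({'name': 'John Doe', 'type': 'comment'})
entry.time = 'Mon Jan  1 00:00:00 2001'     # made-up timestamp
# vars(entry) is the attribute dictionary, so %(key)s lookups find each field.
summary = '%(type)s from %(name)s on %(time)s' % vars(entry)
```

Here summary is the string that FeedbackData's __repr__ would produce for the same attributes.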

Cleaning up. There are many ways to deal with this problem. One easy one is to modify the formletter.py program to keep a list of the filenames that it has already processed (in a pickled file, of course!). This can be done by modifying the if __name__ == '__main__' test to read something like this (the lines that deal with CACHEFILE and processed_files are new):

if __name__ == '__main__':
    import os, pickle
    CACHEFILE = r'C:\cache.pik'
    from feedback import DIRECTORY#, FormData, FeedbackData
    if os.path.exists(CACHEFILE):
        processed_files = pickle.load(open(CACHEFILE))
    else:
        processed_files = [  ]
    for filename in os.listdir(DIRECTORY):
        if filename in processed_files: continue  # Skip this filename.
        processed_files.append(filename)
        data = pickle.load(open(os.path.join(DIRECTORY, filename)))
        if data.type == 'complaint':
            print "Printing letter for %(name)s." % vars(data)
            print_formletter(data)
        else:
            print "Got comment from %(name)s, skipping printing." % \
                  vars(data)
    pickle.dump(processed_files, open(CACHEFILE, 'w'))

As you can tell, you simply load a list of the previous filenames if it exists (and use an empty list otherwise) and compare the filenames with entries in the list to determine which to skip. If you don't skip one, it needs to be added to the list. Finally, at program exit, pickle the new list.
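
The caching logic can also be pulled out of the main block and tried on its own. The following is a sketch under a few assumptions: the function names load_processed and unprocessed are ours, the directory listing is sorted for a predictable order, and the files are opened in binary mode, which pickle requires on Python 3.

```python
import os, pickle

def load_processed(cachefile):
    """Return the pickled list of processed filenames, or [] if none yet."""
    if os.path.exists(cachefile):
        f = open(cachefile, 'rb')
        try:
            return pickle.load(f)
        finally:
            f.close()
    return []

def unprocessed(directory, cachefile):
    """Return names in directory not yet in the cache, then update the cache."""
    done = load_processed(cachefile)
    fresh = [name for name in sorted(os.listdir(directory))
             if name not in done]
    f = open(cachefile, 'wb')
    try:
        pickle.dump(done + fresh, f)     # remember old and new names alike
    finally:
        f.close()
    return fresh
```

A second call to unprocessed with the same arguments returns an empty list until new files appear in the directory. The cache file should live outside the directory being scanned, as it does in the solution above, so that it is never listed as an entry itself.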

Adding parametric plotting to grapher.py. This exercise is quite simple, as all that's needed is to change the drawing code in the Chart class. Specifically, the code between xmin, xmax = 0, N-1 and graphics.fillPolygon(...) should be placed in an if test, so that the new code reads:

if not hasattr(self.data[0], '__len__'):   # It's probably a number (1D).
       xmin, xmax = 0, N-1
# Code from existing program, up to graphics.fillPolygon(xs, ys, len(xs))
elif len(self.data[0]) == 2:               # we'll only deal with 2-D
       xmin = reduce(min, map(lambda d: d[0], self.data))
       xmax = reduce(max, map(lambda d: d[0], self.data))

       ymin = reduce(min, map(lambda d: d[1], self.data))
       ymax = reduce(max, map(lambda d: d[1], self.data))

       zero_y = y_offset - int(-ymin/(ymax-ymin)*height)
       zero_x = x_offset + int(-xmin/(xmax-xmin)*width)

       for i in range(N):
           xs[i] = x_offset + int((self.data[i][0]-xmin)/(xmax-xmin)*width)
           ys[i] = y_offset - int((self.data[i][1]-ymin)/(ymax-ymin)*height)
       graphics.color = self.color
       if self.style == "Line":
           graphics.drawPolyline(xs, ys, len(xs))
       else:
           xs.append(xs[0]); ys.append(ys[0])
           graphics.fillPolygon(xs, ys, len(xs))
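
The scaling step in the 2-D branch can be checked separately from the drawing code. The function below is a sketch of that step only, with names of our own choosing; it assumes that the data spans a nonzero range in both x and y, and it converts to float before dividing so that integer data isn't truncated under Python 2.

```python
def scale_points(data, x_offset, y_offset, width, height):
    """Map (x, y) data pairs to integer pixel coordinates.

    Screen y grows downward, so a larger data y gives a smaller pixel y.
    """
    xmin = min([d[0] for d in data])
    xmax = max([d[0] for d in data])
    ymin = min([d[1] for d in data])
    ymax = max([d[1] for d in data])
    xs = [x_offset + int(float(x - xmin) / (xmax - xmin) * width)
          for x, y in data]
    ys = [y_offset - int(float(y - ymin) / (ymax - ymin) * height)
          for x, y in data]
    return xs, ys

xs, ys = scale_points([(0, 0), (1, 1), (2, 4)], 10, 110, 100, 100)
# xs is [10, 60, 110] and ys is [110, 85, 10]
```

The smallest x lands on x_offset and the largest on x_offset + width, while the smallest y lands on y_offset and the largest on y_offset - height.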
