See Section 27.9
for the exercises.
Avoiding regular expressions. This program is
long and tedious, but not especially complicated. See if you can
understand how it works. Whether this is easier for you than regular
expressions depends on many factors, such as your familiarity with
regular expressions and your comfort with the functions in the
string module. Use whichever type of programming
works for you.

    file = open('pepper.txt')
    text = file.read()
    paragraphs = text.split('\n\n')

    def find_indices_for(big, small):
        # Return the start index of every occurrence of small in big.
        indices = []
        cum = 0
        while 1:
            index = big.find(small)
            if index == -1:
                return indices
            indices.append(index+cum)
            big = big[index+len(small):]
            cum = cum + index + len(small)

    def fix_paragraphs_with_word(paragraphs, word):
        lenword = len(word)
        for par_no in range(len(paragraphs)):
            p = paragraphs[par_no]
            wordpositions = find_indices_for(p, word)
            if wordpositions == []: continue    # Nothing to fix here; try the next paragraph.
            # Work from the last match back, so that a replacement of a
            # different length doesn't shift the positions still to be checked.
            wordpositions.reverse()
            for start in wordpositions:
                # Look for 'pepper' ahead of this occurrence of the word.
                indexpepper = p.find('pepper', start+lenword)
                if indexpepper == -1: continue
                if p[start+lenword:indexpepper].strip():
                    # Something other than whitespace in between!
                    continue
                where = indexpepper+len('pepper')
                if p[where:where+len('corn')] == 'corn':
                    # It's immediately followed by 'corn'!
                    continue
                if p.find('salad', where) == -1:
                    # It's not followed by 'salad'.
                    continue
                # Finally! We get to do a change!
                p = p[:start] + 'bell' + p[start+lenword:]
                paragraphs[par_no] = p          # Change mutable argument!

    fix_paragraphs_with_word(paragraphs, 'red')
    fix_paragraphs_with_word(paragraphs, 'green')
    for paragraph in paragraphs:
        print paragraph+'\n'

We won't repeat the output here;
it's the same as that of the regular expression
solution.

Wrapping a text file with a class. This one is
surprisingly easy, if you understand classes and the
split function in the string
module. The following is a version that has one little twist over and
beyond what we asked for:

    class FileStrings:
        def __init__(self, filename=None, data=None):
            if data is None:
                self.data = open(filename).read()
            else:
                self.data = data
            self.paragraphs = self.data.split('\n\n')
            self.lines = self.data.split('\n')
            self.words = self.data.split()
        def __repr__(self):
            return self.data
        def paragraph(self, index):
            return FileStrings(data=self.paragraphs[index])
        def line(self, index):
            return FileStrings(data=self.lines[index])
        def word(self, index):
            return self.words[index]

This solution, when applied to the file
pepper.txt, gives:

    >>> from FileStrings import FileStrings
    >>> bigtext = FileStrings('pepper.txt')
    >>> print bigtext.paragraph(0)
    This is a paragraph that mentions bell peppers multiple times. For
    one, here is a red Pepper and dried tomato salad recipe. I don't like
    to use green peppers in my salads as much because they have a harsher
    flavor.
    >>> print bigtext.line(0)
    This is a paragraph that mentions bell peppers multiple times. For
    >>> print bigtext.line(-4)
    aren't peppers, they're chilies, but would you rather have a good cook
    >>> print bigtext.word(-4)
    botanist

How does it work? The constructor simply reads all the file into a
big string (the instance attribute data) and then
splits it according to the various criteria, keeping the results of
the splits in instance attributes that are lists of strings. The
paragraph and line accessor methods wrap their result in a new
FileStrings object (word returns a plain string). This
isn't required by the assignment, but
it's nice because it means you can chain the
operations, so that to find out what the last word of the third line
of the third paragraph is, you can just write:

    >>> print bigtext.paragraph(2).line(2).word(-1)
    cook

Describing a directory. There are several
solutions to this exercise, naturally. One simple solution is:

    import os, sys, stat

    def describedir(start):
        def describedir_helper(arg, dirname, files):
            """ Helper function for describing directories """
            print "Directory %s has files:" % dirname
            for file in files:
                # Find the full path to the file (directory + filename).
                fullname = os.path.join(dirname, file)
                if os.path.isdir(fullname):
                    # If it's a directory, say so; no need to find the size.
                    print ' ' + file + ' (subdir)'
                else:
                    # Find out the size and print the info.
                    size = os.stat(fullname)[stat.ST_SIZE]
                    print ' ' + file + ' size=' + str(size)
        # Start the 'walk'.
        os.path.walk(start, describedir_helper, None)

which uses the walk function in the
os.path module, and works just fine:

    >>> import describedir
    >>> describedir.describedir('testdir')
    Directory testdir has files:
     describedir.py size=939
     subdir1 (subdir)
     subdir2 (subdir)
    Directory testdir\subdir1 has files:
     makezeros.py size=125
     subdir3 (subdir)
    Directory testdir\subdir1\subdir3 has files:
    Directory testdir\subdir2 has files:

Note that you could have found the size of the files by doing
len(open(fullname, 'rb').read( )), but this works
only when you have read access to all the files and is quite
inefficient. The stat call in the
os module gives out all kinds of useful
information in a tuple, and the stat module
defines some names that make it unnecessary to remember the order of
the elements in that tuple. See the Library
Reference for details.

Modifying the prompt. The key to this exercise
is to remember that the ps1 and
ps2 attributes of the sys
module can be anything, including a class instance with a __repr__ or __str__ method.
For example:

    import sys, os

    class MyPrompt:
        def __init__(self, subprompt='>>> '):
            self.lineno = 0
            self.subprompt = subprompt
        def __repr__(self):
            # Called each time the interpreter displays the prompt.
            self.lineno = self.lineno + 1
            return os.getcwd()+'|%d'%(self.lineno)+self.subprompt

    sys.ps1 = MyPrompt()
    sys.ps2 = MyPrompt('... ')

This code works as shown (use the -i option of the
Python interpreter to make sure your program starts right away):

    h:\David\book> python -i modifyprompt.py
    h:\David\book|1>>> x = 3
    h:\David\book|2>>> y = 3
    h:\David\book|3>>> def foo( ):
    h:\David\book|3...     x = 3 # The secondary prompt is supported.
    h:\David\book|3...
    h:\David\book|4>>> import os
    h:\David\book|5>>> os.chdir('..')
    h:\David|6>>> # Note that the prompt changed!

Writing a simple shell. Mostly, the following
script, which implements the Unix set of commands (well, some of
them) should be self-explanatory. Note that we've
only put a "help" message for the
ls command, but there should be one for all the
other commands as well:

    import cmd, os, sys, shutil

    class UnixShell(cmd.Cmd):
        def do_EOF(self, line):
            """ The do_EOF command is called when the user presses Ctrl-D (unix)
                or Ctrl-Z (PC). """
            sys.exit()

        def help_ls(self):
            print "ls <directory>: list the contents of the specified directory"
            print " (current directory used by default)"

        def do_ls(self, line):
            # 'ls' by itself means 'list current directory'
            if line == '': dirs = [os.curdir]
            else: dirs = line.split()
            for dirname in dirs:
                print 'Listing of %s:' % dirname
                print '\n'.join(os.listdir(dirname))

        def do_cd(self, dirname):
            # 'cd' by itself means 'go home'.
            if dirname == '': dirname = os.environ['HOME']
            os.chdir(dirname)

        def do_mkdir(self, dirname):
            os.mkdir(dirname)

        def do_cp(self, line):
            words = line.split()
            sourcefiles, target = words[:-1], words[-1] # target could be a dir
            for sourcefile in sourcefiles:
                shutil.copy(sourcefile, target)

        def do_mv(self, line):
            source, target = line.split()
            os.rename(source, target)

        def do_rm(self, line):
            for arg in line.split():
                os.remove(arg)

    class DirectoryPrompt:
        def __repr__(self):
            return os.getcwd()+'> '

    cmd.PROMPT = DirectoryPrompt()
    shell = UnixShell()
    shell.cmdloop()

Note that we've reused the same trick as in exercise
5 of Chapter 8 to have a prompt that adjusts
with the current directory, combined with the trick of modifying the
attribute PROMPT in the cmd
module itself. Of course those weren't part of the
assignment, but it's hard to just limit oneself to a
simple thing when a full-featured one will do. It works, too!

    h:\David\book> python -i shell.py
    h:\David\book> cd ../tmp
    h:\David\tmp> ls
    Listing of .:
    api
    ERREUR.DOC
    ext
    giant_~1.jpg
    icons
    index.html
    lib
    pythlp.hhc
    pythlp.hhk
    ref
    tut
    h:\David\tmp> cd ..
    h:\David> cd tmp
    h:\David\tmp> cp index.html backup.html
    h:\David\tmp> rm backup.html
    h:\David\tmp> ^Z

Of course, to be truly useful, this script needs a lot of error
checking and many more features, all of which is left, as math
textbooks say, as an exercise for the reader.

Redirecting stdout. This is simple: all you have
to do is to replace the first line with:

    import fileinput, sys                    # No change here
    sys.stdout = open(sys.argv[-1], 'w')     # Open the output file.
    del sys.argv[-1]                         # We've dealt with this argument.
    ...                                      # Continue as before.
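
The same idea can be tried out on its own. The following is a minimal sketch, written in Python 3 syntax rather than the Python 2 of the listing above; the function names redirect_stdout_to_last_arg and restore_stdout are made up for illustration, and the restore step is an addition that the original program doesn't need because it simply exits.

```python
import sys

def redirect_stdout_to_last_arg(argv):
    """Treat argv[-1] as the output file: open it, make it sys.stdout,
    and remove it from argv so later code sees only the other arguments.

    Returns the previous sys.stdout so the caller can restore it.
    """
    original = sys.stdout
    sys.stdout = open(argv[-1], 'w')   # Open the output file.
    del argv[-1]                       # We've dealt with this argument.
    return original

def restore_stdout(original):
    """Close the redirected file and put the previous stdout back."""
    sys.stdout.close()
    sys.stdout = original
```

Passing sys.argv itself as argv reproduces the behavior of the two lines in the solution, since the list is modified in place.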
See Section 28.5 for the exercises.
Faking the Web. What you need to do is to create
instances of a class that has the fieldnames attribute and
appropriate instance variables. One possible solution is:

    class FormData:
        def __init__(self, dict):
            # Copy each key/value pair onto the instance as an attribute.
            for k, v in dict.items():
                setattr(self, k, v)

    class FeedbackData(FormData):
        """ A FormData generated by the comment.html form. """
        fieldnames = ('name', 'address', 'email', 'type', 'text')
        def __repr__(self):
            return "%(type)s from %(name)s on %(time)s" % vars(self)

    fake_entries = [
        {'name': "John Doe",
         'address': '500 Main St., SF CA 94133',
         'email': 'john@sf.org',
         'type': 'comment',
         'text': 'Great toothpaste!'},
        {'name': "Suzy Doe",
         'address': '500 Main St., SF CA 94133',
         'email': 'suzy@sf.org',
         'type': 'complaint',
         'text': "It doesn't taste good when I kiss John!"},
        ]

    DIRECTORY = r'C:\complaintdir'

    if __name__ == '__main__':
        import tempfile, pickle, time
        tempfile.tempdir = DIRECTORY
        for fake_entry in fake_entries:
            data = FeedbackData(fake_entry)
            filename = tempfile.mktemp()
            data.time = time.asctime(time.localtime(time.time()))
            pickle.dump(data, open(filename, 'w'))

As you can see, the only thing you really had to change was the way
the constructor for FormData works, since it has
to do the setting of attributes from a dictionary as opposed to a
FieldStorage object.

Cleaning up. There are many ways to deal with
this problem. One easy one is to modify the
formletter.py program to keep a list of the
filenames that it has already processed (in a pickled file, of
course!). This can be done by modifying the if __name__ == '__main__' test to read
something like this:

    if __name__ == '__main__':
        import os, pickle
        CACHEFILE = r'C:\cache.pik'
        from feedback import DIRECTORY #, FormData, FeedbackData
        if os.path.exists(CACHEFILE):
            processed_files = pickle.load(open(CACHEFILE))
        else:
            processed_files = []
        for filename in os.listdir(DIRECTORY):
            if filename in processed_files: continue    # Skip this filename.
            processed_files.append(filename)
            data = pickle.load(open(os.path.join(DIRECTORY, filename)))
            if data.type == 'complaint':
                print "Printing letter for %(name)s." % vars(data)
                print_formletter(data)
            else:
                print "Got comment from %(name)s, skipping printing." % \
                      vars(data)
        pickle.dump(processed_files, open(CACHEFILE, 'w'))

As you can tell, you simply load a list of the previous filenames if
it exists (and use an empty list otherwise) and compare the filenames
with entries in the list to determine which to skip. If you
don't skip one, it needs to be added to the list.
Finally, at program exit, pickle the new list.

Adding parametric plotting to grapher.py. This
exercise is quite simple, as all that's needed is to
change the drawing code in the Chart class. Specifically, the code
between xmin, xmax = 0, N-1 and
graphics.fillPolygon(...) should be placed in an
if test, so that the new code reads:

    if not hasattr(self.data[0], '__len__'):    # It's probably a number (1D).
        xmin, xmax = 0, N-1
        # Code from existing program, up to graphics.fillPolygon(xs, ys, len(xs))
    elif len(self.data[0]) == 2:                # we'll only deal with 2-D
        xmin = reduce(min, map(lambda d: d[0], self.data))
        xmax = reduce(max, map(lambda d: d[0], self.data))
        ymin = reduce(min, map(lambda d: d[1], self.data))
        ymax = reduce(max, map(lambda d: d[1], self.data))
        zero_y = y_offset - int(-ymin/(ymax-ymin)*height)
        zero_x = x_offset + int(-xmin/(xmax-xmin)*width)
        for i in range(N):
            xs[i] = x_offset + int((self.data[i][0]-xmin)/(xmax-xmin)*width)
            ys[i] = y_offset - int((self.data[i][1]-ymin)/(ymax-ymin)*height)
        graphics.color = self.color
        if self.style == "Line":
            graphics.drawPolyline(xs, ys, len(xs))
        else:
            xs.append(xs[0]); ys.append(ys[0])
            graphics.fillPolygon(xs, ys, len(xs))
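
The heart of the 2-D branch is the mapping from (x, y) data pairs to pixel coordinates. The following is a standalone sketch of just that scaling step, in Python 3 syntax and without the graphics object. The function name scale_points and its parameter list are assumptions made for illustration; they don't come from grapher.py. Note that Python 3's / is always true division, whereas the Python 2 listing above divides exactly only when the data are floats.

```python
def scale_points(data, x_offset, y_offset, width, height):
    """Map (x, y) pairs to integer pixel coordinates.

    x grows to the right from x_offset; y is subtracted from y_offset,
    so larger data values land on smaller (higher) pixel rows.
    """
    xmin = min(d[0] for d in data)
    xmax = max(d[0] for d in data)
    ymin = min(d[1] for d in data)
    ymax = max(d[1] for d in data)
    xs, ys = [], []
    for x, y in data:
        xs.append(x_offset + int((x - xmin) / (xmax - xmin) * width))
        ys.append(y_offset - int((y - ymin) / (ymax - ymin) * height))
    return xs, ys

# Three points on a straight line, drawn into a 100x100 pixel area.
points = [(0, 0), (1, 2), (2, 4)]
xs, ys = scale_points(points, x_offset=10, y_offset=110, width=100, height=100)
```

Like the listing it mirrors, this sketch divides by xmax-xmin and ymax-ymin, so it fails when all the x values or all the y values are equal; a fuller version would need to handle that case.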