After the lab, you should be proficient at
You will need to copy all the files from /home/courses/cs111/handouts/lab9 into the directory you created.
You've been writing code for 9 weeks (or so), and the programs have slowly gotten larger. Last week's Deal or No Deal program may have been your biggest yet.
From your lab9 directory, execute the command:
wc ../lab8/*.py
wc stands for "Word Count". We called the "word count" command on all the *.py files (i.e., the Python scripts, not other files or directories) in your lab8 directory. The first column of output is the number of lines, the second is the number of words, and the third is the number of characters in each file. The total for all the files is listed in the last line of output.
For example:
     16      69     459 ../lab8/lab8.1.py
     18      43     383 ../lab8/lab8.2.py
      9      35     243 ../lab8/lab8.3.py
    248     911    7391 ../lab8/lab8.4.py
    291    1058    8476 total
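To make the three columns concrete, here is a minimal Python sketch of the same counting, assuming plain ASCII text. The function name count_text is hypothetical and is not part of the lab files.

```python
def count_text(text):
    """Return (lines, words, characters) for a string, like wc's three columns."""
    num_lines = text.count("\n")   # wc counts newline characters
    num_words = len(text.split())  # words are separated by whitespace
    num_chars = len(text)          # every character, including newlines
    return num_lines, num_words, num_chars


sample = "hello world\nbye\n"
print(count_text(sample))
```

To count a real file, you could pass in the string returned by reading that file.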
Run the wc command again so that you only see the number of lines, using the command-line argument "-l", i.e.,
wc -l ../lab8/*.py
Save the output of the command in a file using the > operator, i.e.,
wc -l ../lab8/*.py > mywc.txt
View the contents of the mywc.txt file, using the Unix commands more or cat or by opening the file in a text editor, such as jEdit.
f is for fiddle
g is for goose
z is for zoo
Your program will read in a text file of names (one name per line), count how many times each name occurs in the text file, and print out the names (in alphabetical order) and the number of times each name occurs.
There are four data files in the names_data directory for you to process. The files contain W&L undergraduates' last names, first names, female first names, and male first names.
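The counting step described above can be sketched with a dictionary that maps each name to the number of times it has been seen. This is only one possible approach; the function names count_names and print_counts and the sample names are made up for illustration.

```python
def count_names(lines):
    """Count how often each name occurs; lines holds one name per line."""
    counts = {}
    for line in lines:
        name = line.strip()          # remove the trailing newline
        if name == "":
            continue                 # skip blank lines
        if name in counts:
            counts[name] += 1        # seen before: add one
        else:
            counts[name] = 1         # first time we see this name
    return counts


def print_counts(counts):
    """Print each name and its count in alphabetical order."""
    for name in sorted(counts):
        print("%s %d" % (name, counts[name]))


# Made-up sample data; an open file can be passed in the same way.
sample_lines = ["Smith\n", "Jones\n", "Smith\n", "Adams\n"]
name_counts = count_names(sample_lines)
print_counts(name_counts)
```

Because a file opened for reading yields its lines one at a time, count_names can be given the open file directly.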
You don't need to save the output from this program. Just look at it yourself and verify that it makes sense.
While this is useful output, we can't easily determine the name that occurs most frequently, and we can't see trends in names' frequencies.
To address this issue, create a FrequencyObject class. (Create the class in a file called freqobj.py.) The following specifies the class's attributes and methods.
Data:
    the FrequencyObject's key
    the number of times the FrequencyObject's key occurred (its count)
Functionality:
    getKey() - returns the FrequencyObject's key
    getCount() - returns the FrequencyObject's count
    updateCount() - increments the FrequencyObject's count by 1
Testing: Write a function that tests your class's methods, similar to the function that tested the Card class's functionality.
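As a rough sketch of how this specification might translate into code, consider the class below. The constructor is an assumption: the specification above does not say what arguments it takes or what the starting count is, so this version takes only the key and starts the count at 1. The test function name is also hypothetical.

```python
class FrequencyObject:
    """Pairs a key with the number of times that key has occurred."""

    def __init__(self, key):
        # Assumption: an object is created the first time its key is seen,
        # so the count starts at 1.
        self.key = key
        self.count = 1

    def getKey(self):
        return self.key

    def getCount(self):
        return self.count

    def updateCount(self):
        self.count += 1


def testFrequencyObject():
    """Exercise each method and check the results."""
    freq = FrequencyObject("Smith")
    assert freq.getKey() == "Smith"
    assert freq.getCount() == 1
    freq.updateCount()
    assert freq.getCount() == 2
    print("All FrequencyObject tests passed")


testFrequencyObject()
```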
Additional Functionality: Add the following method to the implementation of FrequencyObject:
def __cmp__(self, other):
    """Compares this object with another object. Used in a sort method."""
    return cmp(self.count, other.count)
We'll talk about this method more later this week. Briefly, this method will make it easy for us to sort FrequencyObjects by their count.
Make sure the indentation of the method within your code is correct.
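To see what such a comparison does during a sort, the sketch below writes the same count comparison as a standalone function and hands it to sorted through functools.cmp_to_key, which also works in versions of Python that no longer consult __cmp__. The class Freq is a hypothetical stand-in for FrequencyObject, compare_counts is a made-up name, and the counts are illustrative.

```python
from functools import cmp_to_key


class Freq:
    """Hypothetical stand-in for FrequencyObject: just a key and a count."""

    def __init__(self, key, count):
        self.key = key
        self.count = count


def compare_counts(first, second):
    """Return a negative, zero, or positive number, like cmp on the counts."""
    return (first.count > second.count) - (first.count < second.count)


names = [Freq("John", 9), Freq("Robert", 7), Freq("James", 12)]

# Smallest count first.
ascending = sorted(names, key=cmp_to_key(compare_counts))
# Largest count first, which puts the most frequent name at the front.
descending = sorted(names, key=cmp_to_key(compare_counts), reverse=True)

print([item.key for item in ascending])
print([item.key for item in descending])
```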
This exercise illustrates how you can use Python to generate data files to be used with other applications. You will use your FrequencyObject class to print out the name frequency results into a file that can be used by the Unix utility gnuplot. Gnuplot allows us to display our results graphically. See below for more information about Gnuplot.
Copy the second program for this problem. Modify the program so that the dictionary maps the key (the name) to the value (the FrequencyObject). When you see a name again, update the FrequencyObject's count.
After you're done processing the file:
    Get the values from the dictionary, which should be a list of FrequencyObjects.
    Write the results to a file in the following format:
# <name>
<index or x-coordinate> <count>
For example, an output file for male first names would look like
# James
0 12
# John
1 9
# Robert
2 7
# ...
The data above are not the actual values for the W&L data.
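The output step can be sketched as follows, assuming each name goes on a comment line and the line after it holds the name's index and count. To stay self-contained, the sketch uses (name, count) pairs instead of FrequencyObjects; with your class, getKey() and getCount() would supply the two values. The function name write_frequencies is hypothetical, and the counts are the illustrative ones from the example, not real data.

```python
import io


def write_frequencies(out, name_counts):
    """Write (name, count) pairs to the open file-like object out."""
    for index, (name, count) in enumerate(name_counts):
        out.write("# %s\n" % name)             # comment line holding the name
        out.write("%d %d\n" % (index, count))  # x-coordinate and count


pairs = [("John", 9), ("James", 12), ("Robert", 7)]

# Largest count first, so the most popular names lead the file.
most_frequent_first = sorted(pairs, key=lambda pair: pair[1], reverse=True)

# A StringIO buffer stands in for a file opened for writing.
buffer = io.StringIO()
write_frequencies(buffer, most_frequent_first)
dat_text = buffer.getvalue()
print(dat_text)
```

Passing a file opened for writing in place of the buffer would produce the .dat file itself.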
Run your program on each of the data files in the names_data directory. You should name the summary files appropriately, such as female_fnames_freq.dat. (You can name the files either in your program or using Linux commands.)
Now, use a program called gnuplot to draw bar graphs of the data you generated from your programs.
Since you are dealing with text files for the data files and the plot files, it's easiest to use the jedit text editor.
A typical Gnuplot data file consists of lines of text, where each line has two numbers, representing an x-value and a y-value. Here is a Gnuplot data file called "bars.dat", followed by an explanation of its contents:
# number of days in each month of 2010
1 31
2 28
3 31
4 30
5 31
6 30
7 31
8 31
9 30
10 31
11 30
12 31
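Explanation: The first line begins with #, so Gnuplot treats it as a comment and ignores it. Each remaining line gives a month number (the x-value) followed by the number of days in that month (the y-value).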
To plot the "bars.dat" data file, you use a file that contains Gnuplot commands. Here is an example file "bars.plot" that takes "bars.dat" as input and produces an output file "bars.png". The graphic has an xrange of 0 to 13 so that all 12 months will appear and a yrange of 0 to 32. The "plot" command says to use "bars.dat" as the input file and plot the first column (1) as the x-value and the second column (2) as the y-value. The actual image produced appears after the listing of bars.plot:
set terminal png large
# Modify to change the output file
set output "bars.png"
set data style boxes
set boxwidth 0.4
set xtics nomirror
set border 11
# Modify this code to set the x-range
set xrange [0:13]
# Modify this line to set the y-range
set yrange [0:32]
set xlabel "Months"
set ylabel "Days in Month"
set xtics ("Jan" 1, "Feb" 2, "Mar" 3, "Apr" 4, "May" 5, "June" 6,\
"July" 7, "Aug" 8, "Sep" 9, "Oct" 10, "Nov" 11, "Dec" 12)
set key below
plot 'bars.dat' using 1:2 fs solid title "Num Days"
To execute Gnuplot, run
gnuplot <plotfile>
For the above example, you would execute
gnuplot bars.plot
set terminal png large
# Modify to change the output file name
set output "test.png"
set data style boxes
set boxwidth 0.4
set xtics nomirror
set border 11
# Modify this code to set the x-range
set xrange [-1:5]
# Uncomment, modify this line to set the y-range
#set yrange [0:32]
set xlabel "Name"
set ylabel "Number at W and L"
# get the x-axis labels from the comments in the .dat file
set xtics ("James" 0, "John" 1, "Robert" 2)
set key below
plot 'test.dat' using 1:2 fs solid notitle
To create your own graph file, you will probably need to modify the output file's name, the input file's name, the x-axis and y-axis range and labels, and the xtics (x-axis labels). You can modify the file in jEdit.
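If you would rather not type the xtics line by hand, a short Python helper can build it from a list of names in plotting order. This is an optional sketch; the function name make_xtics is hypothetical, and it assumes the x-coordinates start at 0 as in the example plot file.

```python
def make_xtics(names):
    """Build a Gnuplot 'set xtics' line labeling positions 0, 1, 2, ... with names."""
    labels = ['"%s" %d' % (name, index) for index, name in enumerate(names)]
    return "set xtics (" + ", ".join(labels) + ")"


xtics_line = make_xtics(["James", "John", "Robert"])
print(xtics_line)
```

The printed line can then be pasted into your plot file in place of the existing set xtics line.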
Your graph should show the results for the five most popular names. You will generate four graphs: female first names, male first names, first names, and last names.
Note: The five most popular names should be the first five names in your .dat file.
You should put the graphs in your names_data directory so that they don't mess up your printing later.
public_html
directory.
Copy your lab5.html or index.html file into a file called lab9.html (in the public_html directory). See a student assistant or the instructor if you've had trouble with this in the past.
public_html directory.
Last year, Camille wrote a game that would be much easier to write now that we know lists. Modify this program to use lists instead of a bunch of variables. The challenges are to understand what the program is doing (appreciate comments more?) and then modify it so that it uses lists and still gives the correct results. You'll need your game module for the program to work.
Copy your lab9 directory into your turnin directory. (Review the UNIX handout if you don't remember how to do that.)
Use the printLab.sh command to create a file to print out. You should probably print from the labs directory.
View the file using the gv command, such as
gv lab9.ps
Print the file using the lpr command introduced in the first lab.
Labs are due at the beginning of Friday's class. You should hand in the printed copy at the beginning of class, and the electronic version should be in the turnin directory before 1:25 p.m. on Friday.
Ask well before the deadline if you need help turning in your assignment!
wc output: 5 pts