Lab 9: Dictionaries, Object-Oriented Programming, and Writing Files

Goals

After the lab, you should be proficient at

  1. creating and testing your own classes from a specification
  2. creating your own class
  3. developing a larger program using a class to solve a problem and create a file that is used by another program

Review Lab Lecture Slides

Linux

As usual, create a directory for the programs and output you develop in this lab.

You will need to copy all the files from /home/courses/cs111/handouts/lab9 into the directory you created.

How Big Is My Program? (5 pts)

You've been writing code for 9 weeks (or so), and the programs have slowly gotten larger. Last week's Deal or No Deal program may have been your biggest yet.

Objective: Programming in Python

  1. (10) Using a dictionary object, create a program that maps a letter to an example word that starts with that letter. You must have at least three entries in your dictionary. Then, print out the dictionary so that it looks similar to a children's book, and the keys are printed in alphabetical order. Example output looks like:
    f is for fiddle
    g is for goose
    z is for zoo
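
    A minimal sketch of the sorted printing, assuming a dictionary named `examples` with made-up entries (both the name and the helper `format_entries` are illustrative, not part of the assignment):

```python
# Hypothetical entries; any letter-to-word pairs would do.
examples = {"z": "zoo", "f": "fiddle", "g": "goose"}


def format_entries(mapping):
    """Return one 'x is for word' line per key, in alphabetical key order."""
    return ["%s is for %s" % (letter, mapping[letter])
            for letter in sorted(mapping)]


for line in format_entries(examples):
    print(line)
```

    Sorting the keys with `sorted` is what fixes the order; a plain loop over the dictionary does not promise alphabetical order.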
    
  2. (20) The most common last names for people in the United States are Smith, Johnson, and Williams. The most common first names for females are Mary, Patricia, and Linda and for males are James, John, and Robert. (Source: Name Statistics) In this program, you are going to determine the most common names for W&L students.

    Your program will read in a text file of names (one name per line), count how many times each name occurs in the text file, and print out the names (in alphabetical order) and the number of times each name occurs.

    There are four data files in the names_data directory for you to process. The files represent W&L undergraduates' last names, first names, female first names, and male first names.

    You don't need to save the output from this program. Just look at it yourself and verify that it makes sense.

    While this is useful output, we can't easily determine the name that occurs most frequently, and we can't see trends in names' frequencies.
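
    The counting step can be sketched roughly as follows. The function name `count_names` and the sample lines are assumptions for illustration; with a real data file you would pass the open file object in place of the list.

```python
def count_names(lines):
    """Map each non-blank name in lines to the number of times it occurs."""
    counts = {}
    for line in lines:
        name = line.strip()
        if name:  # skip blank lines
            counts[name] = counts.get(name, 0) + 1
    return counts


# Stand-in for the lines of a data file (one name per line).
sample_lines = ["Mary\n", "Linda\n", "Mary\n", "Patricia\n"]
counts = count_names(sample_lines)

# Print the names alphabetically with their counts.
for name in sorted(counts):
    print("%s %d" % (name, counts[name]))
```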

  3. (25) We can't get the output we want from the previous program because it doesn't tie the name and the number of occurrences together. We want to be able to sort by number of occurrences, but, given the number of occurrences, we can't look up the name!

    To address this issue, create a FrequencyObject class. (Create the class in a file called freqobj.py.) The following specifies the class's attributes and methods.

    Data:

    Functionality:

    Reminder of Development Process:

    You may want to review the Card and Deck classes.

    1. Identify the need for a class. (Done!)
    2. Define the class
    3. Define the constructor, which initializes the attributes/data the object needs.
    4. Define the __str__ method
    5. Test the methods you implemented so far.
    6. Define and test the remainder of the methods (in the order listed above)

    Testing: Write a function that tests your class's methods, similar to the function that tested the Card class's functionality.

    Additional Functionality: Add the following method to the implementation of FrequencyObject:

    def __cmp__(self, other):
        """Compares this object with another object.  Used in a sort
        method."""
        return cmp(self.count, other.count)
    

    We'll talk about this method more later. Briefly, this method will make it easy for us to sort FrequencyObjects by their count.

    Make sure the indentation of the method within your code is correct.
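
    To see why a comparison method makes sorting by count easy, here is a small sketch using a hypothetical stand-in class, `Tally`, rather than the FrequencyObject specification itself. The attribute name `name` is an assumption; `count` matches the attribute used by `__cmp__` above. Under Python 3, where `cmp` and `__cmp__` are no longer available, `__lt__` plays the same role for sorting, so the sketch uses it.

```python
class Tally(object):
    """Hypothetical stand-in for FrequencyObject: a name paired with a count."""

    def __init__(self, name, count=1):
        self.name = name      # assumed attribute name
        self.count = count    # same attribute the handout's __cmp__ compares

    def __str__(self):
        return "%s %d" % (self.name, self.count)

    def __lt__(self, other):
        # list.sort() calls this to decide which object comes first.
        return self.count < other.count


tallies = [Tally("John", 9), Tally("Robert", 7), Tally("James", 12)]
tallies.sort()       # least to greatest count
tallies.reverse()    # greatest to least count
ordered_names = [t.name for t in tallies]
print(ordered_names)
```

    Because the comparison looks only at the count, the list never needs a separate lookup from count back to name; each object carries both.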

  4. (20) Putting it all together.

    This exercise illustrates how you can use Python to generate data files to be used with other applications. You will use your FrequencyObject class to print out the name frequency results into a file that can be used by the Unix utility gnuplot. Gnuplot allows us to display our results graphically. See below for more information about Gnuplot.

    Copy the second program for this problem. Modify the program so that the dictionary maps the key (the name) to the value (the FrequencyObject). When you see a name again, update the FrequencyObject's count.

    After you're done processing the file:

    1. Get the values from the dictionary, which should be a list of FrequencyObjects.
    2. Sort the list, then reverse it so that the objects are in order from greatest to least count.
    3. Write the list to a file, one at a time, in the following format:
      # <name>
      <index or x-coordinate> <count>
      

      For example, an output file for male first names would look like

      # James
      0 12
      # John
      1 9
      # Robert
      2 7
      # ...
      

      The values above are illustrative, not the actual W&L data.

    4. Save your output files for each input file in the names_data directory. You should name the summary files appropriately, such as female_fnames_freq.dat. (You can name the files either in your program or using Linux commands.)
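
    The file format in step 3 could be produced along these lines. This is a sketch under assumptions: it takes plain (name, count) pairs that are already sorted, where your program would use FrequencyObjects, and the function names are hypothetical.

```python
def format_frequency_lines(pairs):
    """Build the '# <name>' and '<index> <count>' lines for (name, count) pairs."""
    lines = []
    for index, (name, count) in enumerate(pairs):
        lines.append("# %s" % name)
        lines.append("%d %d" % (index, count))
    return lines


def write_frequency_file(path, pairs):
    """Write the formatted lines to the file at path, one per line."""
    with open(path, "w") as out:
        for line in format_frequency_lines(pairs):
            out.write(line + "\n")


# Illustrative values from the example above, not real W&L data.
sample = [("James", 12), ("John", 9), ("Robert", 7)]
dat_lines = format_frequency_lines(sample)
print("\n".join(dat_lines))
```

    Keeping the formatting separate from the file writing makes it possible to check the lines on screen before creating any .dat files.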

    Gnuplot

    Now, use a program called gnuplot to draw bar graphs of the data you generated from your programs.

    Since you are dealing with text files for the data files and the plot files, it's easiest to use the jEdit text editor.

    Data Files

    A typical Gnuplot data file consists of lines of text, where each line has two numbers, representing an x-value and a y-value. Here is a Gnuplot data file called "bars.dat", followed by an explanation of its contents:

    # number of days in each month of 2010
    1 31
    2 28
    3 31
    4 30
    5 31
    6 30
    7 31
    8 31
    9 30
    10 31
    11 30
    12 31
    

    Explanation:

    Plot Files

    To plot the "bars.dat" data file, you use a file that contains Gnuplot commands. Here is an example file "bars.plot" that takes "bars.dat" as input and produces an output file "bars.png". The graphic has an xrange of 0 to 13 so that all 12 months will appear and a yrange of 0 to 32. The "plot" command says to use "bars.dat" as the input file and plot the first column (1) as the x-value and the second column (2) as the y-value. The actual image produced appears after the listing of bars.plot:

    set terminal png large
    # Modify to change the output file
    set output "bars.png"
    set style data boxes
    set boxwidth 0.4 
    set xtics nomirror
    set border 11
    
    # Modify this code to set the x-range
    set xrange [0:13]
    
    # Modify this line to set the y-range
    set yrange [0:32]
    
    set xlabel "Months" 
    set ylabel "Days in Month"
    
    set xtics ("Jan" 1, "Feb" 2, "Mar" 3, "Apr" 4, "May" 5, "June" 6,\
     "July" 7, "Aug" 8, "Sep" 9, "Oct" 10, "Nov" 11, "Dec" 12) 
    
    set key below
    
    plot 'bars.dat' using 1:2 fs solid title "Num Days"
    
    

    Executing Gnuplot

    To execute Gnuplot, in the terminal run gnuplot <plotfile>

    For the above example, you would execute gnuplot bars.plot

    There should be no output (gnuplot is a bad parent), but you should now see the output file in the directory if you run ls.

    Example Gnuplot Plot File for Our Work

    set terminal png large
    # Modify to change the output file name
    set output "test.png"
    set style data boxes
    set boxwidth 0.4 
    set xtics nomirror
    set border 11
    
    # Modify this code to set the x-range
    set xrange [-0.5:4.5]
    
    # Uncomment, modify this line to set the y-range
    #set yrange [0:32]
    
    set xlabel "Name" 
    set ylabel "Number at W and L"
    
    # get the x-axis labels from the comments in the .dat file
    set xtics ("James" 0, "John" 1, "Robert" 2)
    
    set key below
    
    plot 'test.dat' using 1:2 fs solid notitle
    

    To create your own graph file, you will probably need to modify

    You can modify the file in jEdit.

    Your graph should show the results for the five most popular names. You will generate four graphs: female first names, male first names, first names, and last names.

    Note: The five most popular names should be the first five names in your .dat file. You can graph the first 6 names for the last names.
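
    If you would rather not type the x-axis labels by hand, the `set xtics` line can be built from the comments in the .dat file. The helper below is an optional sketch, not a required part of the lab; its name and the `limit` parameter are assumptions, and it expects the "# <name>" line to come directly before its "<index> <count>" line.

```python
def xtics_from_dat_lines(lines, limit=5):
    """Build a gnuplot 'set xtics' command from '# name' / 'index count' pairs."""
    labels = []
    pending_name = None
    for line in lines:
        text = line.strip()
        if text.startswith("#"):
            # Remember the name until its data line arrives.
            pending_name = text[1:].strip()
        elif text and pending_name is not None:
            index = text.split()[0]
            labels.append('"%s" %s' % (pending_name, index))
            pending_name = None
        if len(labels) == limit:
            break
    return "set xtics (%s)" % ", ".join(labels)


# Stand-in for the lines of a .dat file; illustrative values only.
sample_dat = ["# James\n", "0 12\n", "# John\n", "1 9\n", "# Robert\n", "2 7\n"]
xtics_command = xtics_from_dat_lines(sample_dat)
print(xtics_command)
```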

    Put the graphs in your names_data directory so that they don't interfere with your printing later.

    Objective: Creating a New Web Page

    1. Go into your public_html directory.
    2. Copy your lab5.html or index.html file into a file called lab9.html (in the public_html directory). See a student assistant or the instructor if you've had trouble with this in the past.
    3. Copy the graphs you created in the last part into the public_html directory.
    4. Modify the Lab 9 web page to have an appropriate title, header, and information.
    5. Modify your Lab 9 web page to display the graphs you created.
    6. Add text labeling each graph separately.
    7. Note: you should not display the "old" images from the original index.html or lab5.html pages. Your page should only contain content about this week's lab.
    8. Modify your index.html page to link to your Lab 9 web page.

    Extra Credit (8 pts)

    A student wrote a game that would be much easier to write now that we know lists. Modify this program to use lists instead of a bunch of variables. The challenges are to understand what the program is doing (appreciate comments more?) and then modify it so that it uses lists and still gives the correct results. You'll need your game module for the program to work.

    Finishing up: What to turn in for this lab

    1. IDLE and jEdit may create backup files with the "~" extension. Delete these files from your lab directory to save paper when you print.
    2. Copy your lab9 directory into the turnin directory. (Review the UNIX handout if you don't remember how to do that.)
    3. Remove any .pyc files and any graph files (*.png) from the directory; otherwise, you'll get an error when creating the output file.
    4. Use the printLab.sh command to create a file to print out. You should probably print from the labs directory.
    5. You can always view the output before you print it, using the gv command, such as
      gv lab9.ps

      Print the file using the lpr command introduced in the first lab.

    Labs are due at the beginning of Friday's class. You should hand in the printed copy at the beginning of class, and the electronic version should be in the turnin directory before 1:25 p.m. on Friday.

    Ask well before the deadline if you need help turning in your assignment!

    Grading (100 pts)