Lab 9: Designing Scientific Applications

Goals

After the lab, you should be proficient at

  1. creating and testing your own classes from a specification
  2. developing a larger program (set of classes) to solve a scientific problem

Linux

As usual, create a directory for the programs and output you develop in this lab.

You will need to copy all the files from /home/courses/cs111/handouts/lab9 into the directory you created.

Objective: Programming in Python

For this lab, you'll use "real" program names instead of the typical "lab9.x.py" names we use.

Hints

Problem: Given the IP addresses of the clients that make requests to a web application, what is the distribution of requests across top-level domains? Show the distribution graphically.

(See the lecture notes for more information about what the program should do.)

In the end, you will execute your program on two different log files, generating a data file and a graph of that data for each one.

The following components of your program can be completed successfully in various orders. Note that the order in which I write the requirements is not necessarily the order in which they should be implemented. For example, I would first hardcode the test.log filename into the program and write the code for getting the input filename later. I would also write the constructor and the string representation before the other methods.

Driver Program (50 pts)

At a high level, this program should:

WebClientInfo class (Provided for you in requester.py)

Data: ip address, hostname, top-level domain

Functionality:

DomainRequests class (25 pts)

Data: top-level domain name, number of requests

Functionality:

Testing: Write a function that tests your class.
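
As a rough illustration of the "constructor and string representation first" approach, here is a minimal sketch of a class that holds the two pieces of data listed above, along with a test function. The method name add_request, the default count of 0, and the exact string format are assumptions made for this sketch; follow the functionality described in the lecture notes for your own class.

```python
class DomainRequests:
    """Pairs a top-level domain name with its number of requests."""

    def __init__(self, domain, num_requests=0):
        # Assumption: a new object starts with zero requests unless told otherwise.
        self.domain = domain
        self.num_requests = num_requests

    def __str__(self):
        # Assumption: this display format is only for illustration.
        return "{}: {}".format(self.domain, self.num_requests)

    def add_request(self):
        # Hypothetical method: record one more request from this domain.
        self.num_requests += 1


def test_domain_requests():
    """Exercises the constructor, string representation, and add_request."""
    dr = DomainRequests("edu")
    assert str(dr) == "edu: 0"
    dr.add_request()
    dr.add_request()
    assert dr.num_requests == 2
    assert str(dr) == "edu: 2"
    return True
```

Calling test_domain_requests() after every change to the class is a quick way to check that nothing broke.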

Test Results

For the file test.log, your output file should look like:
# Data format:
# <x-axis value> <num_requests>
# com
1 24
# net
2 5
# edu
3 2
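
One possible way to produce a file in this format from Python is sketched below. The function names and the choice to pass the counts as a list of (domain, count) pairs are assumptions for illustration, not part of the lab specification.

```python
def format_data_lines(counts):
    """Build the lines of the data file.

    counts is a list of (domain, num_requests) pairs, already in the
    order in which they should appear on the x-axis.
    """
    lines = ["# Data format:", "# <x-axis value> <num_requests>"]
    # x-axis values start at 1, matching the sample output above
    for x, (domain, num_requests) in enumerate(counts, start=1):
        lines.append("# " + domain)
        lines.append("{} {}".format(x, num_requests))
    return lines


def write_data_file(filename, counts):
    """Write the formatted lines to filename, one per line."""
    with open(filename, "w") as out:
        out.write("\n".join(format_data_lines(counts)) + "\n")
```

For example, write_data_file("test.dat", [("com", 24), ("net", 5), ("edu", 2)]) would write the eight lines shown above.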

Gnuplot

In this exercise, you will use a program called gnuplot to draw bar graphs. This exercise illustrates how you can use Python to generate data files to be used with other applications.

Since you are dealing with text files, it's easiest to use the jEdit text editor.

Data Files

A typical gnuplot data file consists of lines of text, where each line has two numbers, representing an x-value and a y-value. Here is a gnuplot data file called "bars.dat", followed by an explanation of its contents:
# number of days in each month of 2005

1 31
2 28
3 31
4 30
5 31
6 30
7 31
8 31
9 30
10 31
11 30
12 31

Explanation:

Plot Files

To plot the "bars.dat" data file, you use a file that contains gnuplot commands. Here is an example file "bars.plot" that takes "bars.dat" as input and produces an output file "bars.png". The graphic has an xrange of 0 to 13 so that all 12 months will appear and a yrange of 0 to 32. The "plot" command says to use "bars.dat" as the input file and plot the first column (1) as the x-value and the second column (2) as the y-value. The actual image produced appears after the listing of bars.plot:
set terminal png large
# Modify to change the output file
set output "bars.png"
# "set data style" is deprecated and rejected by newer gnuplot versions
set style data boxes
set boxwidth 0.4 
set xtics nomirror
set border 11

# Modify this code to set the x-range
set xrange [0:13]

# Modify this line to set the y-range
set yrange [0:32]

set xlabel "Months" 
set ylabel "Days in Month"

set xtics ("Jan" 1, "Feb" 2, "Mar" 3, "Apr" 4, "May" 5, "June" 6, "July" 7, "Aug" 8, "Sep" 9, "Oct" 10, "Nov" 11, "Dec" 12) 

set key below

plot 'bars.dat' using 1:2 fs solid title "Num Days"

Executing gnuplot

To execute gnuplot, run gnuplot <plotfile>

For the above example, you would execute gnuplot bars.plot
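
If you would rather launch gnuplot from your Python program than from the terminal, a sketch along these lines may work. It assumes Python 3 and that gnuplot is on your PATH; the function names are made up for this example.

```python
import shutil
import subprocess


def gnuplot_command(plotfile):
    """Return the command, as a list, that runs gnuplot on plotfile."""
    return ["gnuplot", plotfile]


def run_gnuplot(plotfile):
    """Run gnuplot on plotfile.

    Returns True if gnuplot ran and exited successfully, and False if
    gnuplot is not installed or reported an error.
    """
    if shutil.which("gnuplot") is None:
        return False
    result = subprocess.run(gnuplot_command(plotfile))
    return result.returncode == 0
```

Calling run_gnuplot("bars.plot") has the same effect as typing gnuplot bars.plot at the prompt.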

Example gnuplot Plot File for our Work

set terminal png large
# Modify to change the output file
set output "test.png"
# "set data style" is deprecated and rejected by newer gnuplot versions
set style data boxes
set boxwidth 0.4 
set xtics nomirror
set border 11

# Modify this code to set the x-range
set xrange [0:4]

# Uncomment, modify this line to set the y-range
#set yrange [0:32]

set xlabel "Top-Level Domain" 
set ylabel "Number of Requests"

# get the x-axis labels from the comments in the .dat file
set xtics ("com" 1, "net" 2, "edu" 3)

set key below

plot 'test.dat' using 1:2 fs solid notitle

To create your own graph file, you will probably need to modify the output file's name, the x-axis, the y-axis, and the xtics (x-axis labels).

NOTE: If you have more than 10 top-level domains in your gnuplot data file, you don't need to show them all; just show the first 10.
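
Editing the xrange and xtics lines by hand works fine, but they can also be derived from the comments in the data file. The sketch below is one hedged way to do that. It assumes the data file follows the format shown under Test Results, with x-values numbered 1, 2, 3, and so on; the function names are hypothetical.

```python
def read_domains(dat_lines, limit=10):
    """Return a list of (label, x, count) tuples from data-file lines.

    Each "# label" comment is paired with the data line that follows it.
    Only the first `limit` entries are kept.
    """
    header = ("# Data format:", "# <x-axis value> <num_requests>")
    entries = []
    label = None
    for line in dat_lines:
        line = line.strip()
        if not line or line in header:
            continue  # skip blank lines and the two format comments
        if line.startswith("#"):
            label = line[1:].strip()  # remember the label for the next data line
        else:
            x, count = line.split()
            entries.append((label, int(x), int(count)))
    return entries[:limit]


def plot_settings(entries):
    """Return the gnuplot xrange and xtics lines for the given entries."""
    xtics = ", ".join('"{}" {}'.format(label, x) for label, x, _ in entries)
    return [
        "set xrange [0:{}]".format(len(entries) + 1),
        "set xtics ({})".format(xtics),
    ]
```

If you plot only the first 10 domains, remember that the xrange must be cut off there as well, which is why the sketch computes it from the truncated list.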

Objective: Creating a New Web Page

  1. Go into your public_html directory.
  2. Copy your lab7.html file into a file called lab9.html (in the public_html directory).
  3. Copy the graphs you created in the last part into the public_html directory.
  4. Modify the Lab 9 web page to have an appropriate title, header, and information.
  5. Modify your Lab 9 web page to display the graphs you created.
  6. Add text describing the results for each graph, separately, stating who uses each application.
  7. Briefly, compare the two results shown in the two graphs.
  8. Note: you should not display the "old" images from the original index.html or lab7.html pages. This should just contain content about this week's lab.
  9. Modify your index.html page to link to your Lab 9 web page.

Extra Credit (up to 10 pts)

Option 1: Improve Usability of Your Program (6 pts)

Option 2: Alternative Data Aggregation (4 pts)

Create data files that summarize the number of requests from "second-level" domains. For example, we would consider requests from "*.wlu.edu" and "*.vmi.edu" separately.

To get full credit, you must analyze the data and compare it with the results from the original part of the lab. What additional or different information does this data tell you?
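
As a starting point, the second-level domain can be taken from the last two pieces of a hostname. This is only a sketch: the function name is made up, it assumes the hostname has already been resolved (a raw IP address would give a meaningless result), and it does not handle country-code hostnames such as "www.example.co.uk", for which it would return "co.uk".

```python
def second_level_domain(hostname):
    """Return the last two dot-separated parts of hostname.

    Returns None if the hostname has fewer than two parts.
    """
    parts = hostname.lower().rstrip(".").split(".")
    if len(parts) < 2:
        return None
    return ".".join(parts[-2:])
```

With this function, "www.cs.wlu.edu" and "www.wlu.edu" both map to "wlu.edu", while "www.vmi.edu" maps to "vmi.edu", so the two schools are counted separately.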

Option 3: Plot Other Data Files (3 pts)

Plot other log files and show the resulting graphs on your web page.

Finishing up: What to turn in for this lab

  1. IDLE and jEdit may create backup files with the "~" extension. Delete these files from your lab directory to save paper when you print.
  2. Copy your lab9 directory into the turnin directory. (Review the UNIX handout if you don't remember how to do that.)
  3. Before printing, move the original .log files out of the lab9 directory so that you don't print them. Also, remove the .pyc files and the graph files (*.png) from the directory; otherwise, you'll get an error when creating the output file.
  4. Use the printLab.sh command to create a file to print out. You should probably print from the labs directory.
  5. You can always view the output before you print it, using the gv command, such as
    gv lab9.ps

    Print the file using the lpr command introduced in the first lab.

Labs are due at the beginning of Friday's class. You should hand in the printed copy at the beginning of class, and the electronic version should be in the turnin directory before 2:25 p.m. on Friday.

Ask well before the deadline if you need help turning in your assignment!

Grading (100 pts)