Extra Credit: Learning Java's Regular Expression Package
Objectives:
- To learn a useful Java API on your own
- To figure out what information you need (and don't need) to solve a problem.
- To learn about regular expressions, which can be really powerful when parsing or validating data (and you'll talk about them in CSCI312 and CSCI313, if you take them). Need more convincing? Check out this XKCD comic.
Due: November 18 before class.
Learning Regular Expressions
Java provides a tutorial about regular expressions in Java.
This regular expression tester for Java may be useful when you're testing/validating regular expressions.
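As a warm-up, here is a minimal sketch of the two core classes in java.util.regex. The class name RegexBasics and the digits pattern are made up for illustration. The point it shows is the difference between matches(), which must consume the whole input, and find(), which looks for a match anywhere in the input.

```java
import java.util.regex.Pattern;

final class RegexBasics {
    // Hypothetical pattern for illustration: one or more ASCII digits.
    static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // matches() succeeds only if the entire input fits the pattern.
    static boolean isAllDigits(String s) {
        return DIGITS.matcher(s).matches();
    }

    // find() succeeds if any substring of the input fits the pattern.
    static boolean containsDigits(String s) {
        return DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("12345"));
        System.out.println(isAllDigits("123abc"));
        System.out.println(containsDigits("123abc"));
    }
}
```

For validation tasks you will usually want matches(), since a validator should reject input that merely contains something valid.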
Creating Your Own Regular Expressions
(Up to 7 points) Phone number validator
Write a program that takes as input a file of possible phone numbers, each on its own line, and displays each phone number and whether it is valid. The format of the phone number should be (###) ###-####.
To get you started, here is an example file of phone numbers, some of which are valid.
Create JUnit test cases that check the correctness of your regular expressions, without the added complexity of reading from a file to test.
Demonstrate that your validator works when reading from a file. Save the output in a file.
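One possible way to structure the phone number validator is sketched below. It is a starting point under stated assumptions, not a reference solution: the class name PhoneValidator, the method names, the "valid"/"invalid" output wording, and the command-line argument are all made up for illustration. It also assumes exactly one space after the closing parenthesis and ASCII digits only. Keeping the check in its own method means a JUnit test can call isValid directly without touching a file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class PhoneValidator {
    // (###) ###-#### : literal parentheses, one space, one hyphen, ASCII digits.
    private static final Pattern PHONE =
            Pattern.compile("\\([0-9]{3}\\) [0-9]{3}-[0-9]{4}");

    // Pure check with no file access, so it is easy to unit test.
    static boolean isValid(String candidate) {
        return candidate != null && PHONE.matcher(candidate).matches();
    }

    // Builds one output row per input line: the text, then the verdict.
    static List<String> report(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line + " : " + (isValid(line) ? "valid" : "invalid"));
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is a hypothetical path to a file with one candidate per line.
        if (args.length != 1) {
            System.err.println("usage: java PhoneValidator <input-file>");
            return;
        }
        for (String row : report(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(row);
        }
    }
}
```

Redirecting standard output to a file when running the program is one simple way to save the results.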
(Up to 8 points) Variable name validator
Create a small tutorial for introductory computer science students on how to name variables in Python. (Do you remember the naming rules?) For the purposes of this assignment, you don't need to check that variable names are not reserved words.
Your program should take as input a file of possible variable names, each on its own line, and display each variable name and whether it is valid.
Demonstrate that your validator works using JUnit test cases and reading in from a text file. Save your output from reading in the text file.
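A minimal sketch of the name check follows, assuming the ASCII-only rule usually taught in introductory courses: a name starts with a letter or underscore and continues with letters, digits, or underscores. Python 3 also accepts many non-ASCII letters, which this sketch deliberately ignores. The class name PythonNameValidator and the sample names are hypothetical.

```java
import java.util.regex.Pattern;

final class PythonNameValidator {
    // Assumption: ASCII-only names. First character is a letter or underscore,
    // every later character is a letter, digit, or underscore.
    private static final Pattern NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Reserved words are not checked, as the assignment allows.
    static boolean isValid(String candidate) {
        return candidate != null && NAME.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // Hypothetical sample names; a real run would read them from a file.
        String[] samples = {"total_count", "_tmp", "2fast", "my-var"};
        for (String name : samples) {
            System.out.println(name + " : " + (isValid(name) ? "valid" : "invalid"));
        }
    }
}
```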
(Up to 15 points) Email address validator
Validating email addresses is an (unexpectedly?) hard problem.
Create a regular expression to validate email addresses. Create an input file that lists potential email addresses along with whether each one is actually valid. Using your validator, count how many email addresses were
- correctly identified as email addresses
- correctly identified as not email addresses
- incorrectly identified as email addresses (false positives)
- incorrectly identified as not email addresses (false negatives)
In comments, discuss the accuracy of your regular expression. Is it "good enough"? Under what circumstances is it not good enough? Where would you most like to see it improve?
You can create several regular expressions and compare them and their tradeoffs in accuracy.
Demonstrate that your code works and save the output in a file.
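The four counts above can be tallied as in the sketch below. Everything specific here is an assumption made for illustration: the class name EmailTally, the deliberately simple pattern, and the input format of one "address,true" or "address,false" entry per line. The pattern is intentionally imperfect so that the sample data produces one of each outcome.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class EmailTally {
    // Deliberately simple, hypothetical pattern:
    // local part, '@', then two or more dot-separated domain labels.
    private static final Pattern EMAIL = Pattern.compile(
            "[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+");

    static boolean looksLikeEmail(String s) {
        return EMAIL.matcher(s).matches();
    }

    // Each line is assumed to be "<address>,<true|false>".
    // Returns {truePositives, trueNegatives, falsePositives, falseNegatives}.
    static int[] tally(List<String> labeledLines) {
        int[] counts = new int[4];
        for (String line : labeledLines) {
            int comma = line.lastIndexOf(',');
            if (comma < 0) {
                continue; // skip lines that carry no label
            }
            String address = line.substring(0, comma);
            boolean expected = Boolean.parseBoolean(line.substring(comma + 1).trim());
            boolean predicted = looksLikeEmail(address);
            if (predicted && expected) {
                counts[0]++;
            } else if (!predicted && !expected) {
                counts[1]++;
            } else if (predicted) {
                counts[2]++; // accepted, but labeled invalid
            } else {
                counts[3]++; // rejected, but labeled valid
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical labeled data; a real run would read these lines from a file.
        List<String> sample = Arrays.asList(
                "alice@example.com,true",
                "not-an-email,false",
                "alice@-example-.com,false", // labels may not start or end with '-'
                "o'brien@example.com,true"); // apostrophe is legal in a local part
        int[] c = tally(sample);
        System.out.println("correct accepts: " + c[0]);
        System.out.println("correct rejects: " + c[1]);
        System.out.println("false positives: " + c[2]);
        System.out.println("false negatives: " + c[3]);
    }
}
```

Swapping in a second pattern and rerunning the same labeled file gives a direct way to compare tradeoffs between expressions.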
Submission
Create a jar file from your project.
Copy your jar file into an ec_regex directory in your turnin directory. Email me to let me know that you submitted the assignment.
Grading (Up to 30 pts extra credit on lab grade)
You will be evaluated based on the following criteria:
- Correctness of each program
- Style, organization, evidence of testing