== Project ==
Learn how to plot data with the matplotlib plotting library. Ditch Excel forever!
== Goals ==
* practice reading data from a file
* practice using the matplotlib Python plotting library to analyze data and generate graphs
== Project setup ==
=== Install the project dependencies ===
Please follow the official matplotlib installation instructions at http://matplotlib.sourceforge.net/users/installing.html
The dependencies vary across operating systems. http://matplotlib.sourceforge.net/users/installing.html#build-requirements summarizes what you'll need for your operating system.
A universal dependency is the NumPy scientific computing library. NumPy has download and installation instructions at http://numpy.scipy.org/
Installing matplotlib and its dependencies is somewhat involved; please ask for help if you get stuck or don't know where to start!
==== Mac OS X users only ====
If you do not already have a C compiler installed, you'll need one to install matplotlib. You have several options depending on your situation:
# Download and install Xcode (1.5 GB) from https://developer.apple.com/xcode/
# Download and install Command Line Tools for Xcode (175 MB) from https://developer.apple.com/downloads/index.action. This requires an Apple Developer account (free, but you have to sign up).
# Download and install kennethreitz's gcc installer (requires 10.6 or 10.7) from https://github.com/kennethreitz/osx-gcc-installer/
Please wave over a staff member and we'll help you pick which option is best for you computer.
=== Download and un-archive the Jeopardy database project skeleton code ===
Un-archiving will produce a <code>JeopardyDatabase</code> folder containing 3 Python files and one SQL database dump.
=== Create a SQLite database from the database dump ===
Inside <code>JeopardyDatabase</code> is a file called <code>jeopardy.dump</code> which contains a SQL database dump. We need to turn that database dump into a SQLite database.
Once you have SQLite installed, you can create a database from jeopardy.dump with:
<pre>sqlite3 jeopardy.db < jeopardy.dump</pre>
This creates a sqlite3 database called <code>jeopardy.db</code>
=== Test your setup ===
At a command prompt, start <code>sqlite3</code> using the <code>jeopardy.db</code> database by running:
That should start a sqlite prompt that looks like this:
SQLite version 3.6.12
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
At that sqlite prompt, type <code>.tables</code> and hit enter. That should display a list of the tables in this database:
From a command prompt, navigate to the <code>JeopardyDatabase</code> directory and run
You should see a list of 10 jeopardy categories printed to the screen. If you don't, let a staff member know so you can debug this together.
== Project steps ==
=== 1. Look at the layout of the Jeopardy database ===
Start sqlite with <code>sqlite3 jeopardy.db</code>. Then look at the tables in your database by running the following commands and the sqlite prompt:
* <tt>.table</tt>, which will list the tables in the database
* <tt>.schema category</tt>, which will show the organization of the <tt>category</tt> table, including the fields and the data types they store.
It should look like this:
<pre>sqlite> .schema category
CREATE TABLE "category" (
id INTEGER PRIMARY KEY,
name VARCHAR(255) NOT NULL,
game INTEGER NOT NULL,
This tells us that the <code>category</code> table has 4 fields: <code>id</code>, <code>name</code>, <code>game</code>, and <code>boardPosition</code>.
* <tt>.schema clue</tt>
Read these short documents:
* What is SQL? http://www.w3schools.com/sql/sql_intro.asp
* What is a database schema? http://wiki.answers.com/Q/What_is_a_database_schema
<b>Check your understanding</b>:
* What tables are in the database?
* What is a schema?
* What fields are in the <code>category</code> table?
* What fields are in the <code>clue</code> table?
=== 2. Query the database with SELECT ===
Try running the following queries from the sqlite prompt:
* <tt>SELECT * FROM category;</tt>
* <tt>SELECT NAME FROM category;</tt>
* <tt>SELECT * FROM clue;</tt>
* <tt>SELECT text, answer, value FROM clue;</tt>
* <tt>SELECT text, answer, value FROM clue LIMIT 10;</tt>
Explore the <code>category</code> and <code>clue</code> tables with your own SELECT queries.
<b>Check your understanding</b>:
* What does <code>*</code> mean in the above queries?
* What does the <code>LIMIT</code> SQL keyword do?
* Does case matter when making SQL queries?
<b>Step 2 resources</b>:
Using SELECT: http://www.w3schools.com/sql/sql_select.asp
=== 3. Make database queries from Python ===
Examine the code in <code>jeopardy_categories.py</code>. To make a database query from Python, you need to:
# Import a Python library for making database connections
# Establish a connection to the desired database
# Get a cursor from the database for making queries
# Execute the database query using the standard SQL syntax
# Retrieve the list of results from the database cursor
# Do something useful with the results, like print them
# Close the database connection
Match up each of these steps with lines of code in the file.
=== 4. Tweak the existing Jeopardy scripts ===
==== 1. Modify <code>jeopardy_categories.py</code> to print both the category and game number ====
<b>tip</b>: Remind yourself of the <code>categories</code> schema by running <code>.schema category</code> at a sqlite prompt.
DETECTIVE FICTION (game #1)
THE OLD TESTAMENT (game #2)
ASIAN HISTORY (game #4)
RIVER SOURCES (game #5)
WORLD RELIGION (game #3)
SEAN SONG (game #2)
ANIMATED MOVIES (game #1)
NEW YORK CITY (game #6)
AFRICAN WILDLIFE (game #7)
LITTLE RED RIDING HOOD (game #8)</pre>
==== 2. Modify <code>jeopardy_clues.py</code> to only print clues with an $800 value. ====
A good way to achieve this is by adding a <code>WHERE</code> clause to the SQL query in <code>jeopardy_clues.py</code>.
Read about <code>WHERE</code> clauses in this short document:
* The WHERE Clause: http://www.w3schools.com/sql/sql_where.asp
A: She also created the detectives Tuppence & Tommy Beresford
Q: What is 'Agatha Christie'
A: According to this Old Testament book, this "swords into plowshares" prophet walked naked for 3 years
Q: What is 'Isaiah'
=== 5. Daily Doubles ===
Write a script that prints 10 daily doubles and their responses.
<b>tip</b>: The <code>clue</code> table has an <code>isDD</code> field.
<pre>Category: NEW YORK CITY
Question: The heart of Little Italy is this street also found in a Dr. Seuss book title
Answer: Mulberry Street
Category: RIVER SOURCES
Question: This Mideastern boundary river rises on the slopes of Mount Hermon
Answer: the Jordan
Question: The Titanic has 3 rooms for this--only men were allowed there, as women weren't supposed to do it in public
=== 1. Random category clues ===
Write a script that randomly chooses a category and prints clues from that category.
<b>tip</b>: SQL supports an "<code>ORDER BY RANDOM()</code>" clause that will return rows in a random order. For example, to randomly pick 1 category id you could use:
<pre>SELECT id FROM category ORDER BY RANDOM() LIMIT 1</pre>
You can also use <code>ORDER BY</code> to sort the clues by value.
<pre>5 GUYS NAMED MOE
[$200] Last name of Moe of the Three Stooges
[$400] Moe Strauss founded this auto parts chain along with Manny Rosenfield & Jack Jackson
[$600] Major league catcher Moe Berg was also a WWII spy for this agency, precursor of the CIA
[$800] Term for the type of country music Moe Bandy plays, the clubs where he began, or the "Queen" he sang of in 1981
[$1000] This "Kool" rapper's album "How Ya Like Me Now" began a rivalry with LL Cool J</pre>
Using ORDER BY: http://www.w3schools.com/sql/sql_orderby.asp
=== 2. Random game categories ===
Write a script to randomly choose a game number and print the categories from that game.
<b>tip</b>: the <code>category</code> table has <code>game</code> and <code>round</code> fields. round 0 is the Jeopardy round, round 1 is the Double Jeopardy round, and round 2 is Final Jeopardy.
<pre>Categories for game #136:
0 WELCOME TO MY COUNTRY
0 GO GO GAUGUIN
0 FILE UNDER "M"
0 ANIMATED CATS
0 MAO MAO MAO MAO
1 SHAKESPEARE'S OPENING LINES
1 HEY, MARIO!
1 BRIDGE ON THE RIVER....
1 RUNNING MATES
1 13-LETTER WORDS
1 TONY BENNETT'S SONGBOOK
2 IN THE NEWS 2000</pre>
=== 3. Top 20 Jeopardy categories ===
Read about the <code>GROUP BY</code> clause and write a script using it to print the 20 most common Jeopardy categories.
An example of using <code>GROUP BY</code> and <code>ORDER BY</code> to produce an ordered list of counts on a hypothetical <code>foo</code> field is:
<pre>SELECT foo, COUNT(foo) AS count FROM my_table GROUP BY foo ORDER BY count</pre>
79 BEFORE & AFTER
73 WORD ORIGINS
64 BUSINESS & INDUSTRY
63 AMERICAN HISTORY
learned about SQL and making database queries from within Python. Keep practicing!