Learn how to get data from a database in Python while writing parts of a Jeopardy game, using real Jeopardy data!
- practice using the SQL database query language
- practice getting data from a database in Python
1. Download and un-archive the Jeopardy database project skeleton code
Un-archiving will produce a
JeopardyDatabase folder containing 3 Python files and one SQL database dump.
2. Install the project dependencies
- Download the SQLite command line binary from http://www.sqlite.org/download.html. Download the first link under "Precompiled Binaries For Windows".
On OS X
- You already have SQLite installed and don't have to do anything!
- Install the
sqlite3package through your package manager.
3. Create a SQLite database from the database dump
JeopardyDatabase folder is a file called
jeopardy.dump which contains a SQL database dump. We need to turn that database dump into a SQLite database.
Once you have SQLite installed, you can create a database from jeopardy.dump with (this needs to be run in the command prompt):
sqlite3 jeopardy.db < jeopardy.dump
This creates a sqlite3 database called
4. Test your setup
At a command prompt, start
sqlite3 using the
jeopardy.db database by running:
That should start a sqlite prompt that looks like this:
SQLite version 3.6.12 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite>
At that sqlite prompt, type
.tables and hit enter. That should display a list of the tables in this database:
sqlite> .tables category clue sqlite>
To quit sqlite:
From a command prompt, navigate to the
JeopardyDatabase directory and run
If you run into a syntax error because of "print", go into the file and make sure the printed statement is encased in a ()
You should see a list of 10 jeopardy categories printed to the screen. If you don't, let a staff member know so you can debug this together.
1. Look at the layout of the Jeopardy database
Start sqlite with:
Then look at the tables in your database by running the following commands and the sqlite prompt:
- .table, which will list the tables in the database
- .schema category, which will show the organization of the category table, including the fields and the data types they store.
It should look like this:
sqlite> .schema category CREATE TABLE "category" ( id INTEGER PRIMARY KEY, name VARCHAR(255) NOT NULL, game INTEGER NOT NULL, boardPosition INTEGER );
This tells us that the
category table has 4 fields:
- .schema clue
Read these short documents:
- What is SQL? http://www.w3schools.com/sql/sql_intro.asp
- What is a database schema? http://wiki.answers.com/Q/What_is_a_database_schema
Check your understanding:
- What tables are in the database?
- What is a schema?
- What fields are in the
- What fields are in the
2. Query the database with SELECT
Try running the following queries from the sqlite prompt:
- SELECT * FROM category;
- SELECT NAME FROM category;
- SELECT * FROM clue;
- SELECT text, answer, value FROM clue;
- SELECT text, answer, value FROM clue LIMIT 10;
clue tables with your own SELECT queries.
Check your understanding:
- What does
*mean in the above queries?
- What does the
LIMITSQL keyword do?
- Does case matter when making SQL queries?
Step 2 resources:
- Using SELECT: http://www.w3schools.com/sql/sql_select.asp
3. Make database queries from Python
Examine the code in
jeopardy_categories.py. To make a database query from Python, you need to:
- Import a Python library for making database connections
- Establish a connection to the desired database
- Get a cursor from the database for making queries
- Execute the database query using the standard SQL syntax
- Retrieve the list of results from the database cursor
- Do something useful with the results, like print them
- Close the database connection
Match up each of these steps with lines of code in the file.
4. Tweak the existing Jeopardy scripts
jeopardy_categories.py to print both the category and game number
tip: Remind yourself of the
categories schema by running
.schema category at a sqlite prompt.
Example categories: DETECTIVE FICTION (game #1) THE OLD TESTAMENT (game #2) ASIAN HISTORY (game #4) RIVER SOURCES (game #5) WORLD RELIGION (game #3) SEAN SONG (game #2) ANIMATED MOVIES (game #1) NEW YORK CITY (game #6) AFRICAN WILDLIFE (game #7) LITTLE RED RIDING HOOD (game #8)
jeopardy_clues.py to only print clues with an $800 value.
A good way to achieve this is by adding a
WHERE clause to the SQL query in
WHERE clauses in this short document:
- The WHERE Clause: http://www.w3schools.com/sql/sql_where.asp
Example clues: [$800] A: She also created the detectives Tuppence & Tommy Beresford Q: What is 'Agatha Christie' [$800] A: According to this Old Testament book, this "swords into plowshares" prophet walked naked for 3 years Q: What is 'Isaiah' ...
5. Daily Doubles
Write a script that prints 10 daily doubles and their responses.
clue table has an
tip: Note that the category name is in the category table, and the question and answer are in the clue table.
Category: NEW YORK CITY Question: The heart of Little Italy is this street also found in a Dr. Seuss book title Answer: Mulberry Street === Category: RIVER SOURCES Question: This Mideastern boundary river rises on the slopes of Mount Hermon Answer: the Jordan === Category: ROOM Question: The Titanic has 3 rooms for this--only men were allowed there, as women weren't supposed to do it in public Answer: smoking ...
1. Random category clues
Write a script that randomly chooses a category and prints clues from that category.
tip: SQL supports an "
ORDER BY RANDOM()" clause that will return rows in a random order. For example, to randomly pick 1 category id you could use:
SELECT id FROM category ORDER BY RANDOM() LIMIT 1;
You can also use
ORDER BY to sort the clues by value.
5 GUYS NAMED MOE [$200] Last name of Moe of the Three Stooges [$400] Moe Strauss founded this auto parts chain along with Manny Rosenfield & Jack Jackson [$600] Major league catcher Moe Berg was also a WWII spy for this agency, precursor of the CIA [$800] Term for the type of country music Moe Bandy plays, the clubs where he began, or the "Queen" he sang of in 1981 [$1000] This "Kool" rapper's album "How Ya Like Me Now" began a rivalry with LL Cool J
- Using ORDER BY: http://www.w3schools.com/sql/sql_orderby.asp
2. Random game categories
Write a script to randomly choose a game number and print the categories from that game.
category table has
round fields. round 0 is the Jeopardy round, round 1 is the Double Jeopardy round, and round 2 is Final Jeopardy.
Categories for game #136: 0 WELCOME TO MY COUNTRY 0 METALS 0 GO GO GAUGUIN 0 FILE UNDER "M" 0 ANIMATED CATS 0 MAO MAO MAO MAO 1 SHAKESPEARE'S OPENING LINES 1 HEY, MARIO! 1 BRIDGE ON THE RIVER.... 1 RUNNING MATES 1 13-LETTER WORDS 1 TONY BENNETT'S SONGBOOK 2 IN THE NEWS 2000
3. Top 20 Jeopardy categories
Read about the
GROUP BY clause and write a script using it to print the 20 most common Jeopardy categories.
An example of using
GROUP BY and
ORDER BY to produce an ordered list of counts on a hypothetical
foo field is:
SELECT foo, COUNT(foo) AS count FROM my_table GROUP BY foo ORDER BY count;
81 LITERATURE 79 BEFORE & AFTER 73 WORD ORIGINS 71 SCIENCE 64 BUSINESS & INDUSTRY 63 AMERICAN HISTORY ...
You've learned about SQL and making database queries from within Python. Keep practicing!