How many times does the word 'love' appear in Shakespeare's plays? Is it possible to find negative passages using a list of keywords? We'll use Python to practice our skills and answer questions like these.


We'll need additional setup here.


  • Have fun using Python to learn basic data science
  • Practice searching for information in text documents
  • Practice manipulating strings
  • Practice using loops
  • Practice using lists
  • Practice using dictionaries
  • Get experience with regular expressions

Skills & Exercises


  • Checking if two strings are equal
>>> s = "mama"
>>> s == "mama"
True
>>> s == "papa"
False
  • Checking if a string contains another as a substring
>>> s = "mama"
>>> s in "I love mama"
True
>>> "day" in "Saturday"
True
>>> "Day" in "Saturday"
False
>>> Sat = "Saturday"
>>> "day" in Sat
True
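The checks with "day" and "Day" show that `in` is case-sensitive. The sketch below is one possible way to ignore case, by lowercasing both strings before comparing. The function name `contains_ignore_case` is made up for this illustration.

```python
def contains_ignore_case(text, word):
    """Return True if word occurs in text, ignoring upper/lower case."""
    # Lowercase both sides so "Day" and "day" compare as equal.
    return word.lower() in text.lower()


print(contains_ignore_case("Saturday", "Day"))  # True
print("Day" in "Saturday")                      # False
```

This will be useful later when searching a play for a word such as 'love', which may also appear as 'Love' at the start of a line.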

File Operations

  • Open a file
>>> M_S = open("A Midsummer-Night's Dream.txt", "r")
>>> M_S
<open file "A Midsummer-Night's Dream.txt", mode 'r' at 0x102d04660>
  • Read a line
>>> M_S = open("A Midsummer-Night's Dream.txt", "r")
>>> line = M_S.readline()
>>> print(line)
< Shakespeare -- A MIDSUMMER-NIGHT'S DREAM >
  • Read a file line by line until the end of the file (also known as EOF)
>>> for eachline in M_S:
...     # Do something here with each line read, for example print it
...     print(eachline)
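The line-by-line loop can be combined with the substring check from the previous section. The sketch below is one possible way to count the lines of a file that contain a given word. The function name `count_lines_with` and the small stand-in file are assumptions made for this illustration; they are not part of the workshop files. The `with` statement is used here so that the file is closed automatically.

```python
import os
import tempfile


def count_lines_with(filename, word):
    """Count the lines of a text file that contain word (case-sensitive)."""
    count = 0
    with open(filename, "r") as f:
        for eachline in f:
            if word in eachline:
                count += 1
    return count


# Create a small stand-in file so the sketch runs without the play's text.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("I love mama\nSaturday\nlove is blind\n")

result = count_lines_with(path, "love")
print(result)  # 2

os.remove(path)  # clean up the stand-in file
```

Note that this counts lines, not occurrences: a line containing 'love' twice is counted once, and a line containing 'lovely' also matches.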

File & Strings Exercise

  • We will use the play Romeo and Juliet

1. Open the file "Romeo and Juliet.txt". (Save this file in the same directory as your Python program so that you can open it by its local file name.)