Community Data Science Workshops (Fall 2014)/Day 2 lecture: Difference between revisions

From OpenHatch wiki
imported>Fhocutt
(Created page with "== Lecture Outline == # Introduction to APIs #* definition of API: just an interface for programs #* #* definition of web API #** like a website for your #** way to ask for...")
 
imported>Fhocutt
(draft lecture outline)
Line 1: Line 1:
== Lecture Outline ==
== Lecture Outline ==
Introduction and context
* You know how to use Python now. Congratulations!
* Today we'll learn how to find/create data sets
* Next week we'll get into data science (asking and answering questions)

Outline:
* What did we learn in Session 1?
* What is an API?
* How do we use one to fetch interesting datasets?
* How do we write programs that use the internet?
* How do we use the placekitten API to fetch kitten pictures?
* Introduction to structured data (JSON)
* How do we use APIs in general?

What is a (web) API?
* API: a structured way for programs to talk to each other (aka an interface for programs)
* Web APIs: like a website your programs can visit (you:a website::your program:a web API)
* You send a request to the API, it gives you data back
** Where do I direct my request? The site's API endpoint.
*** For example: Wikipedia's web API endpoint is http://en.wikipedia.org/w/api.php
** How do I write my request? Put together a URL; it will be different for different web APIs.
*** Check the documentation, look for code samples
** How do I send a request?
*** Python has modules you can use, like <code>requests</code> (they make HTTP requests)
** What do you get back?
*** Structured data (usually in the JSON format)
** Congratulations, you have data!


# Introduction to APIs
#* definition of API: just an interface for programs
#*
#* definition of web API
#** like a website for your
#** way to ask for data (almost always a URL)
#** way to ask for data (almost always a URL)
#** way to get data back (almost always in a format called JSON)
#** way to get data back (almost always in a format called JSON)
Line 15: Line 37:
#** the ability to save to files
#** the ability to save to files
#** the ability to understand (i.e., parse) JSON data that APIs usually give us
#** the ability to understand (i.e., parse) JSON data that APIs usually give us

# Review material from last session
What did we learn in Session 1?
#* variables
* Navigating in the terminal and using it to run programs
#* lists
* Writing Python:
#* dictionaries
* variables
#* if statements
* lists
* dictionaries
* if statements
#* for loops
#* for loops
#* printing
#* printing
#* modules
#* modules

# New programming concepts:
# New programming concepts:
#* requests
#* urllib2 and urlopen
#* interpolate variables into a string using % and %()s
#* interpolate variables into a string using % and %()s
#* open files and write to them
#* open files and write to them
Line 35: Line 61:
#* show how to use place
#* show how to use place
#* write a small program to grab an arbitrary square picture from placekitten by asking for the size on standard input
#* write a small program to grab an arbitrary square picture from placekitten by asking for the size on standard input

# JSON file (JavaScript Object Notation)
# JSON file (JavaScript Object Notation)
#* what is JSON: useful for more structured data
#* what is JSON: useful for more structured data
Line 43: Line 70:
#* Example file at http://mako.cc/cdsw.json
#* Example file at http://mako.cc/cdsw.json
#* download it and parse it
#* download it and parse it

# Other APIs
# Other APIs
#* every API is different, so read the documentation!
#* every API is different, so read the documentation!
* If the documentation isn't helpful, search online!
#* for popular APIs, there are Python modules that help you make requests and parse JSON!
* for popular APIs, there are Python modules that help you make requests and parse JSON!
#* rate limiting
* rate limiting
#* authentication
* authentication
#* text encoding issues
* text encoding issues

Revision as of 04:38, 10 November 2014

Lecture Outline

Introduction and context

  • You know how to use Python now. Congratulations!
  • Today we'll learn how to find/create data sets
  • Next week we'll get into data science (asking and answering questions)

Outline:

  • What did we learn in Session 1?
  • What is an API?
  • How do we use one to fetch interesting datasets?
  • How do we write programs that use the internet?
  • How do we use the placekitten API to fetch kitten pictures?
  • Introduction to structured data (JSON)
  • How do we use APIs in general?

What is a (web) API?

  • API: a structured way for programs to talk to each other (aka an interface for programs)
  • Web APIs: like a website your programs can visit (you:a website::your program:a web API)
  • You send a request to the API, it gives you data back
    • Where do I direct my request? The site's API endpoint.
    • How do I write my request? Put together a URL; it will be different for different web APIs.
      • Check the documentation, look for code samples
    • How do I send a request?
      • Python has modules you can use, like requests (they make HTTP requests)
    • What do you get back?
      • Structured data (usually in the JSON format)
    • Congratulations, you have data!
    • In short, a web API gives you:
      • a way to ask for data (almost always a URL)
      • a way to get data back (almost always in a format called JSON)
      • documentation to read, because every API is different
    • to use APIs to build a dataset we will need:
      • all our tools from last session: variables, etc
      • the ability to open urls on the web
      • the ability to create custom URLs
      • the ability to save to files
      • the ability to understand (i.e., parse) JSON data that APIs usually give us
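
The request cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `build_api_url` and `fetch_json` are invented for this example, the endpoint is Wikipedia's web API endpoint (http://en.wikipedia.org/w/api.php), the query parameters are sample values that should be checked against that API's documentation, and the `requests` module has to be installed separately.

```python
from urllib.parse import urlencode

# Wikipedia's web API endpoint. The parameters used further down are
# assumptions chosen for illustration; check the API documentation.
WIKIPEDIA_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_api_url(endpoint, params):
    """Return the endpoint with the parameters appended as a query string."""
    return endpoint + "?" + urlencode(params)


def fetch_json(url):
    """Send an HTTP GET request and parse the JSON body of the response."""
    # Imported here so build_api_url still works without requests installed.
    import requests

    response = requests.get(url)
    response.raise_for_status()  # stop early on HTTP errors such as 404
    return response.json()       # JSON text -> Python dictionaries and lists


# Step 1: put together a URL for the request.
url = build_api_url(
    WIKIPEDIA_ENDPOINT,
    {"action": "query", "titles": "Python", "prop": "info", "format": "json"},
)
print(url)

# Step 2 (needs an internet connection, so it is left commented out):
# data = fetch_json(url)
# print(data)
```

Building the URL is kept separate from sending the request so that the URL can be printed and pasted into a browser first, which is often the quickest way to see what an API gives back.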

What did we learn in Session 1?

  • Navigating in the terminal and using it to run programs
  • Writing Python:
    • variables
    • lists
    • dictionaries
    • if statements
    • for loops
    • printing
    • modules
  1. New programming concepts:
    • requests
    • interpolate variables into a string using % and %()s
    • open files and write to them
  2. placekitten.com
    • API that takes specially crafted URLs and returns appropriately sized pictures of kittens
    • example of placekitten in browser
      • visit the API documentation
      • kittens of different sizes
      • kittens in greyscale or color
    • show how to use place
    • write a small program to grab an arbitrary square picture from placekitten by asking for the size on standard input
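
One possible shape for that small program is sketched below. It combines the new concepts listed earlier: interpolating variables into a string with % and %()s, making a request, and writing to a file. The function names and the output filename are invented for this example, and the URL patterns (width/height, with a `g/` prefix for greyscale) are assumed from placekitten's documentation, so check them before relying on this.

```python
def kitten_url(size, greyscale=False):
    """Build a placekitten URL for a square picture `size` pixels wide."""
    if greyscale:
        # Positional interpolation with %
        return "http://placekitten.com/g/%s/%s" % (size, size)
    # Named interpolation with %()s and a dictionary
    return "http://placekitten.com/%(width)s/%(height)s" % {
        "width": size,
        "height": size,
    }


def save_kitten(size, filename):
    """Download a square kitten picture and write it to `filename`."""
    # Imported here so kitten_url still works without requests installed.
    import requests

    response = requests.get(kitten_url(size))
    response.raise_for_status()
    # Pictures are binary data, so open the file in "wb" mode.
    with open(filename, "wb") as picture_file:
        picture_file.write(response.content)


def main():
    """Ask for a size on standard input, then fetch and save the picture."""
    size = int(input("How many pixels wide should the kitten be? "))
    filename = "kitten-%s.jpg" % size  # hypothetical naming scheme
    save_kitten(size, filename)
    print("Saved %s" % filename)
```

Calling `main()` prompts for a number and, given an internet connection, saves the picture next to the script.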
  3. JSON file (JavaScript Object Notation)
    • what is JSON: useful for more structured data
    • import json; json.loads()
    • like Python (except no single quotes)
    • simple lists, dictionaries
    • can reflect more complicated data structures
    • Example file at http://mako.cc/cdsw.json
    • download it and parse it
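
A short sketch of parsing JSON with `json.loads()` follows. The JSON text is made up for illustration and is not the content of the example file mentioned above.

```python
import json

# Made-up JSON text. Strings always use double quotes, never single quotes.
json_text = '{"name": "Mittens", "sizes": [100, 200], "colors": {"greyscale": true}}'

data = json.loads(json_text)  # JSON text -> Python dictionaries and lists

print(data["name"])                 # a value looked up in a dictionary
print(data["sizes"][1])             # an item from a list inside the dictionary
print(data["colors"]["greyscale"])  # JSON true becomes Python True

# json.dumps goes the other way: Python data -> JSON text
print(json.dumps(data["sizes"]))
```

When the JSON comes from the web instead of a string, the response object returned by `requests.get()` has a `.json()` method that does the same parsing in one step.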
  4. Other APIs
    • every API is different, so read the documentation!
    • If the documentation isn't helpful, search online!
    • for popular APIs, there are Python modules that help you make requests and parse JSON!
    • rate limiting
    • authentication
    • text encoding issues
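
As one way to handle rate limiting, the sketch below pauses between requests. The helper name `fetch_politely` and the one-second default are assumptions for illustration. Real limits differ from API to API, so the documentation is the place to find the right delay.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, pausing between calls.

    `fetch` is any function that takes a URL and returns data,
    for example one built on requests.get.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request after the first one.
            time.sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

Passing the fetching function in as an argument means the loop can be tried out with a stand-in function before it is pointed at a real API.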