Difference between revisions of "Community Data Science Workshops (Spring 2014)/Saturday May 3rd lecture"

imported>Mako
imported>Jtmorgan
m (moved to wiki.communitydata.cc)
 
(6 intermediate revisions by one other user not shown)

Latest revision as of 22:04, 15 March 2015

Page Moved
All material related to the Community Data Science Workshops has been moved from the OpenHatch wiki to a new dedicated wiki, and this page is no longer being updated here. Please visit the new version of the page on the Community Data Science Collective wiki.

Lecture Outline

  1. Introduction to APIs
    • definition of API: just an interface for programs
    • definition of web API
      • way to ask for data (almost always a URL)
      • way to get data back (almost always in a format called JSON)
      • every API is different, and documented
    • to use APIs to build a dataset we will need:
      • all our tools from last session: variables, etc
      • the ability to open urls on the web
      • the ability to create custom URLs
      • the ability to save to files
      • the ability to understand (i.e., parse) JSON data that APIs usually give us
  2. Review material from last session
    • variables
    • lists
    • dictionaries
    • if statements
    • for loops
    • printing
    • modules
  3. New programming concepts:
    • urllib2 and urlopen
    • interpolate variables into a string using % and %()s
    • open files and write to them
  4. placekitten.com
    • API that takes specially crafted URLs and returns appropriately sized pictures of kittens
    • example of placekitten in browser
      • visit the API documentation
      • kittens of different sizes
      • kittens in greyscale or color
    • show how to use place
    • write a small program to grab an arbitrary square image from placekitten by asking for the size on standard input
  5. JSON file (JavaScript Object Notation)
    • what is JSON: a text format useful for more structured data
    • import json; json.loads()
    • looks like Python (except strings cannot use single quotes)
    • simple lists, dictionaries
    • can reflect more complicated data structures
    • Example file at http://mako.cc/cdsw.json
    • download it and parse it
  6. Other APIs
    • every API is different, so read the documentation!
    • for popular APIs, there are Python modules that help you make requests and parse JSON!
    • rate limiting
    • authentication
    • text encoding issues
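
The outline items on creating custom URLs and on interpolating variables with % and %()s can be sketched roughly as follows. The function names are hypothetical, and the /width/height path layout and the "g/" greyscale prefix are assumptions about how placekitten shapes its URLs, so check the API documentation before relying on them.

```python
# Base address from the outline. The /width/height layout and the "g/"
# greyscale prefix are assumptions about the placekitten URL format.
BASE = "http://placekitten.com"


def kitten_url(width, height, grey=False):
    """Build a placekitten-style URL using % string interpolation."""
    if grey:
        # Named placeholders: %(name)s and %(name)d look values up in a dictionary.
        return "%(base)s/g/%(w)d/%(h)d" % {"base": BASE, "w": width, "h": height}
    # Positional placeholders: values are taken from the tuple, in order.
    return "%s/%d/%d" % (BASE, width, height)


def square_kitten_url(size_text):
    """Turn text typed by a user (for example, read from standard input) into a square URL."""
    size = int(size_text.strip())  # raises ValueError if the text is not a whole number
    return kitten_url(size, size)


print(kitten_url(200, 300))
print(kitten_url(100, 100, grey=True))
print(square_kitten_url(" 250\n"))
```

In an interactive program, the text passed to square_kitten_url would come from the user instead of being written into the code.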
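
As a minimal sketch of the JSON item above, json.loads() turns JSON text into ordinary Python lists and dictionaries. The sample text here is made up for illustration and is not the contents of cdsw.json.

```python
import json

# Hypothetical sample text; NOT the actual contents of cdsw.json.
text = '{"workshop": "CDSW", "sessions": [1, 2, 3], "mentors": {"count": 2}}'

data = json.loads(text)  # a JSON object becomes a Python dictionary
print(data["workshop"])

for session in data["sessions"]:  # a JSON array becomes a Python list
    print(session)

print(data["mentors"]["count"])  # nested structures are reached step by step

# Single-quoted strings are fine in Python but are not valid JSON.
try:
    json.loads("{'workshop': 'CDSW'}")
    single_quotes_ok = True
except ValueError as error:
    single_quotes_ok = False
    print("not valid JSON:", error)
```

Once parsed, the data can be handled with the for loops, lists, and dictionaries reviewed earlier in the session.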
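
Opening a URL and saving the result to a file might look like the sketch below. It assumes Python 3, where the urlopen function that the outline reaches through urllib2 lives in urllib.request instead. The function name save_url is hypothetical.

```python
# Python 3 sketch: urllib2.urlopen from the outline is urllib.request.urlopen here.
from urllib.request import urlopen


def save_url(url, filename):
    """Download whatever is at url and write the raw bytes to filename."""
    with urlopen(url) as response:
        content = response.read()  # the body arrives as bytes
    # "wb" writes the bytes unchanged, which suits images as well as JSON text.
    with open(filename, "wb") as output:
        output.write(content)
    return len(content)  # number of bytes saved


# Example call (needs network access, so it is left commented out):
# save_url("http://mako.cc/cdsw.json", "cdsw.json")
```

Writing bytes rather than text sidesteps some of the text encoding issues mentioned above, since nothing is decoded until the file is read back.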