Community Data Science Workshops (Spring 2014)/Saturday May 3rd lecture: Difference between revisions

imported>Mako
imported>Jtmorgan
m (moved to wiki.communitydata.cc)
 
(6 intermediate revisions by one other user not shown)
Line 1: Line 1:
{{CDSW_Moved}}
== Lecture Outline ==
== Lecture Outline ==


Line 14: Line 15:
#** the ability to understand (i.e., parse) JSON data that APIs usually give us
#** the ability to understand (i.e., parse) JSON data that APIs usually give us
# Review material from last session
# Review material from last session
#* variables, different types
#* variables
#* printing
#* if statements
#* lists
#* lists
#* dictionaries
#* dictionaries
#* if statements
#* for loops
#* for loops
#* printing
#* modules
#* modules
#* example python program
# New programming concepts:
# New programming concepts:
#* urllib2 and urlopen
#* urllib2 and urlopen
#* interpolate variables into a string using % and %()s
#* interpolate variables into a string using % and %()s
#* open files and write to them
# [http://placekitten.com/ placekitten.com]
# [http://placekitten.com/ placekitten.com]
#* API that takes specially crafted URLs and gives appropriately sized picture of kittens
#* API that takes specially crafted URLs and gives appropriately sized picture of kittens
Line 41: Line 42:
#* Example file at http://mako.cc/cdsw.json
#* Example file at http://mako.cc/cdsw.json
#* download it and parse it
#* download it and parse it
# Wikipedia API
#* explain MediaWiki, exists on other wikis
#* navigate to [http://en.wikipedia.org/w/api.php api page] and show the documentation, point out examples
#* looking at the images within a page http://en.wikipedia.org/w/api.php?action=query&titles=Seattle&prop=images&imlimit=20&format=jsonfm
#* looking at within two pages http://en.wikipedia.org/w/api.php?action=query&titles=Seattle|Bellevue,_Washington&prop=images&imlimit=50&format=jsonfm
#* edit count http://en.wikipedia.org/w/api.php?action=query&list=users&ususers=Benjamin_Mako_Hill|Jtmorgan|Sj|Mindspillage&usprop=editcount&format=jsonfm
#* give me the content of the main page http://en.wikipedia.org/w/api.php?format=json&action=query&titles=Main%20Page&prop=revisions&rvprop=content
# Other APIs
# Other APIs
#* every API is different, so read the documentation!
#* every API is different, so read the documentation!
#* for popular APIs, there are python modules that help you make requests and parse json!
#* rate limiting
#* rate limiting
#* authenticaiton
#* authenticaiton

Latest revision as of 22:04, 15 March 2015

Page Moved
All material related to the Community Data Science Workshops has been moved from the OpenHatch wiki to a new dedicated wiki, and this page is no longer being updated here. Please visit the new version of the page on the Community Data Science Collective wiki.

Lecture Outline

  1. Introduction to APIs
    • definition of API: just an interface for programs
    • definition of web API
      • way to ask for data (almost always a URL)
      • way to get data back (almost always in a format called JSON)
      • every API is different, and documented
    • to use APIs to build a dataset we will need:
      • all our tools from last session: variables, etc
      • the ability to open urls on the web
      • the ability to create custom URLs
      • the ability to save to files
      • the ability to understand (i.e., parse) JSON data that APIs usually give us
  2. Review material from last session
    • variables
    • lists
    • dictionaries
    • if statements
    • for loops
    • printing
    • modules
  3. New programming concepts:
    • urllib2 and urlopen
    • interpolate variables into a string using % and %()s
    • open files and write to them
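The two interpolation styles and the file writing listed under item 3 can be sketched as follows. This is a minimal illustration, not the workshop's own example: the variable names, the `write_lines` helper, and the sample values are all assumptions made for the sketch.

```python
# Positional interpolation with %: values are filled in from left to right.
width = 200
height = 300
size_text = "%d by %d pixels" % (width, height)

# Named interpolation with %(name)s: values are looked up in a dictionary.
params = {"site": "placekitten.com", "width": 200, "height": 300}
url = "http://%(site)s/%(width)s/%(height)s" % params


def write_lines(filename, lines):
    """Write each string in lines to filename, one string per line.

    Hypothetical helper for this sketch; opening with "w" replaces
    any existing file of the same name.
    """
    with open(filename, "w") as output_file:
        for line in lines:
            output_file.write(line + "\n")
```

Calling `write_lines("sizes.txt", [size_text, url])` would save both strings to a file in the current directory.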
  4. placekitten.com
    • API that takes specially crafted URLs and returns appropriately sized pictures of kittens
    • example of placekitten in browser
      • visit the API documentation
      • kittens of different sizes
      • kittens in greyscale or color
    • show how to use placekitten
    • write a small program that grabs an arbitrary square image from placekitten, asking for the size on standard input
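The small program described in item 4 might look something like the sketch below. It rests on several assumptions: the URL scheme (`/width/height`, with a `/g/` prefix for greyscale) is taken from how placekitten has historically documented itself, the function names and output filename are invented for illustration, and `urllib.request.urlopen` is used as the Python 3 counterpart of the `urllib2.urlopen` named in the outline.

```python
import sys
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen


def kitten_url(size, grey=False):
    """Build the URL for a square kitten picture (assumed URL scheme)."""
    if grey:
        return "http://placekitten.com/g/%d/%d" % (size, size)
    return "http://placekitten.com/%d/%d" % (size, size)


def save_kitten(size, filename, grey=False):
    """Download the picture and write its raw bytes to filename."""
    response = urlopen(kitten_url(size, grey))
    # Images are binary data, so the file is opened in "wb" mode.
    with open(filename, "wb") as image_file:
        image_file.write(response.read())


def main():
    """Read a size from standard input and save a square of that size."""
    size = int(sys.stdin.readline())
    save_kitten(size, "kitten-%d.jpg" % size)

# Call main() to run the program; it needs network access.
```

Keeping the URL construction in its own function makes it easy to check the crafted URL in a browser before downloading anything.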
  5. JSON file (JavaScript Object Notation)
    • what is JSON: a format useful for more structured data
    • import json; json.loads()
    • looks like Python (except strings must use double quotes, not single quotes)
    • simple lists, dictionaries
    • can reflect more complicated data structures
    • Example file at http://mako.cc/cdsw.json
    • download it and parse it
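Parsing JSON as outlined in item 5 can be sketched as below. The sample text is made up for illustration and is not the content of the example file linked above; `is_valid_json` and `parse_json_url` are hypothetical helpers, and the latter assumes the server sends UTF-8 text.

```python
import json
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen

# Made-up sample data: JSON objects become dictionaries, arrays become lists.
text = '{"workshop": "example", "sessions": [1, 2, 3], "details": {"open": true}}'
data = json.loads(text)
first_session = data["sessions"][0]
is_open = data["details"]["open"]  # JSON true becomes Python True


def is_valid_json(candidate):
    """Return True if candidate parses as JSON, False otherwise."""
    try:
        json.loads(candidate)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


def parse_json_url(url):
    """Download url and parse the body as JSON (assumes UTF-8 text)."""
    return json.loads(urlopen(url).read().decode("utf-8"))
```

The string `{'a': 1}` is a valid Python dictionary literal but not valid JSON, so `is_valid_json` rejects it, while `{"a": 1}` is accepted.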
  6. Other APIs
    • every API is different, so read the documentation!
    • for popular APIs, there are Python modules that help you make requests and parse JSON!
    • rate limiting
    • authentication
    • text encoding issues
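Two of the concerns in item 6, rate limiting and text encoding, can be illustrated with a short sketch. The one-second delay, the function names, and the UTF-8 assumption are all illustrative choices; a real API's documentation states its actual limits and encoding.

```python
import time
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen


def fetch_text(url):
    """Download url and decode the bytes into text (assumes UTF-8)."""
    raw_bytes = urlopen(url).read()
    return raw_bytes.decode("utf-8")


def fetch_politely(urls, fetch=fetch_text, delay=1.0):
    """Fetch each URL in turn, pausing delay seconds between requests.

    The pause is a simple way to stay under a rate limit; the fetch
    argument lets a different download function be swapped in.
    """
    results = []
    for position, url in enumerate(urls):
        if position > 0:
            time.sleep(delay)  # wait before every request after the first
        results.append(fetch(url))
    return results
```

Decoding explicitly matters because the network delivers bytes, not text: decoding with the wrong encoding either raises an error or produces garbled characters.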