Community Data Science Workshops (Fall 2014)/Day 2 lecture

{{CDSW Moved}}
 
[[File:Highfivekitten.jpeg|200px|thumb|In which you learn how to use Python and web APIs to meet the likes of her!]]
== Lecture Slides ==
 
* [http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.pdf Slides (PDF)]: for viewing
* [http://mako.cc/teaching/2014/cdsw-autumn/lecture2-web_apis.odp Slides (ODP, LibreOffice slides format)]: for editing and modification
== Resources ==
 
* Encoding:
** [http://nedbatchelder.com/text/unipain.html Pragmatic Unicode]
** [https://docs.python.org/2/howto/unicode.html Official Python Unicode documentation]
 
 
== Lecture Outline ==
;Introduction and context
 
* You can write some tools in Python now. Congratulations!
* Today we'll learn how to find/create data sets
* Next week we'll get into data science (asking and answering questions)
 
 
;Outline:
 
* What did we learn in Session 1?
* What is an API?
* How do we use APIs in general?
 
 
;What is a (web) API?
 
* API: a structured way for programs to talk to each other (aka an interface for programs)
* Web APIs: like a website your programs can visit (a website is to you what a web API is to your program)
* Using APIs: your program sends a request, the API sends data back
 
 
; How do we use an API to fetch datasets?
 
Basic idea: your program sends a request, the API sends data back
* Where do you direct your request? The site's API endpoint.
** For example: Wikipedia's web API endpoint is http://en.wikipedia.org/w/api.php
* How do you write your request? Put together a URL; it will be different for different web APIs.
** Check the documentation, look for code samples
* How do you send a request?
** Python has modules you can use, like <code>requests</code> (they make HTTP requests)
* What do you get back?
** Structured data (usually in the JSON format)
* How do you understand (i.e. parse) the data?
** There's a module for that!
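The steps above can be sketched in a few lines of Python 3. This is a minimal illustration rather than the code used in the lecture: the helper names <code>build_url</code> and <code>fetch_json</code> are made up for this example, and the query parameters (<code>action</code>, <code>titles</code>, <code>format</code>) are an assumed example of a MediaWiki API request. Only the URL is built when the block runs; the request itself is sent when you call <code>fetch_json</code>.

```python
import json
from urllib.parse import urlencode

# Endpoint named in the notes above.
ENDPOINT = "http://en.wikipedia.org/w/api.php"


def build_url(endpoint, params):
    """Put together a request URL from an endpoint and a dict of parameters."""
    return endpoint + "?" + urlencode(params)


def fetch_json(url):
    """Send the request and parse the JSON that comes back (hypothetical helper)."""
    import requests  # third-party module; imported here so build_url works without it
    response = requests.get(url)
    response.raise_for_status()  # stop early if the server reported an error
    return json.loads(response.text)


# Assumed example parameters: ask for the page titled "Cat", answered in JSON.
params = {"action": "query", "titles": "Cat", "format": "json"}
url = build_url(ENDPOINT, params)
print(url)
```

With the <code>requests</code> module installed and a network connection, <code>fetch_json(url)</code> should return a Python dictionary you can explore with ordinary indexing.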
 
 
; How do we write Python programs that make web requests?
 
To use APIs to build a dataset we will need:
* all our tools from last session: variables, etc.
* the ability to open URLs on the web
* the ability to create custom URLs
* the ability to save to files
* the ability to understand (i.e., parse) JSON data that APIs usually give us
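As a rough sketch of how the last two abilities fit together, the Python 3 example below parses a piece of JSON text and saves it as a tab-separated file. The JSON text, the field names <code>name</code> and <code>width</code>, the helper <code>save_rows</code>, and the output filename are all hypothetical; a real API will have its own structure.

```python
import json
import os
import tempfile


def save_rows(rows, filename):
    """Write one tab-separated line per record (hypothetical helper)."""
    with open(filename, "w", encoding="utf-8") as output_file:
        for row in rows:
            # %(name)s and %(width)s are looked up in each row dictionary.
            output_file.write("%(name)s\t%(width)s\n" % row)


# Hypothetical JSON text standing in for what an API might send back.
response_text = '[{"name": "Mittens", "width": 200}, {"name": "Tom", "width": 300}]'
rows = json.loads(response_text)

# Save into the system's temporary directory so the sketch runs anywhere.
output_path = os.path.join(tempfile.gettempdir(), "kittens.tsv")
save_rows(rows, output_path)
print("wrote %s rows to %s" % (len(rows), output_path))
```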
 
 
; Session 1 review
 
* Navigating in the terminal and using it to run programs
* Writing Python:
** importing modules, so you can use code other people have written for you!
 
 
; New programming concepts:
 
* interpolate variables into a string using % and %()s
* requests
* open files and write to them
* parsing a string (turning the string into a data structure we can manipulate)
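Two of these concepts, string interpolation and parsing, are easy to try without a network connection. The short Python 3 sketch below uses made-up values; the variable names are only for illustration.

```python
width = 200
height = 300

# Positional interpolation: each %s is filled in order from the tuple.
size_label = "%s by %s pixels" % (width, height)

# Named interpolation: each %(key)s is looked up in the dictionary.
sentence = "A kitten that is %(w)s wide and %(h)s tall" % {"w": width, "h": height}

# Parsing: turn the string "200x300" into a list of two integers.
text = "200x300"
parsed = [int(piece) for piece in text.split("x")]

print(size_label)
print(sentence)
print(parsed)
```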
 
; How do we use an API to fetch kitten pictures?
 
[http://placekitten.com/ placekitten.com]
* API that takes specially crafted URLs and returns appropriately sized pictures of kittens
* Exploring placekitten in a browser:
** visit the API documentation
** kittens of different sizes
** kittens in greyscale or color
* Now we write a small program to grab an arbitrary square from placekitten by asking for the size on standard in: [http://mako.cc/teaching/2014/cdsw-autumn/placekitten_raw_input.py placekitten_raw_input.py]
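The sketch below shows one way such a program could look in Python 3. It is not the linked <code>placekitten_raw_input.py</code>; the function names are invented for this example, and the URL patterns (<code>/width/height</code>, with a <code>/g/</code> prefix for greyscale) are assumed from the placekitten documentation. Python 3's <code>input()</code> plays the role of Python 2's <code>raw_input()</code>.

```python
def kitten_url(size, grey=False):
    """Build a placekitten URL for a square picture (assumed URL pattern)."""
    if grey:
        return "http://placekitten.com/g/%s/%s" % (size, size)
    return "http://placekitten.com/%s/%s" % (size, size)


def save_kitten(size, filename):
    """Download one square kitten picture and save it to a file."""
    import requests  # third-party module; imported here so kitten_url works without it
    response = requests.get(kitten_url(size))
    response.raise_for_status()
    # Pictures are binary data, so the file is opened in "wb" mode.
    with open(filename, "wb") as picture_file:
        picture_file.write(response.content)


def main():
    """Ask for a size on standard input, then fetch and save the picture."""
    size = input("How big should the kitten be (in pixels)? ")
    save_kitten(size, "kitten_%s.jpg" % size)
```

Calling <code>main()</code> runs the program interactively; it needs the <code>requests</code> module and a network connection.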
 
 
; Introduction to structured data (JSON, JavaScript Object Notation)
 
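* what is JSON: a text format useful for more structured data
* <code>import json</code>; <code>json.loads()</code> turns JSON text into Python data
* looks like Python (except strings take double quotes, never single quotes)
* simple lists, dictionaries
* can reflect more complicated data structures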
* Example file at http://mako.cc/cdsw.json
* download it and parse it: [http://mako.cc/teaching/2014/cdsw-autumn/parse_cdswjson.py parse_cdswjson.py]
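To get a feel for <code>json.loads()</code> without downloading anything, here is a small Python 3 sketch. The JSON text is hypothetical and is not the content of <code>cdsw.json</code>; it only shows how lists and dictionaries nest, and what happens when single quotes are used.

```python
import json

# Hypothetical JSON text; the real contents of cdsw.json are not shown here.
text = '{"workshop": "CDSW", "sessions": [1, 2, 3], "mentors": {"count": 2}}'

data = json.loads(text)  # data is now an ordinary Python dictionary
print(data["workshop"])
print(data["sessions"][1])
print(data["mentors"]["count"])

# JSON strings must use double quotes, so this Python-style text is rejected.
try:
    json.loads("{'workshop': 'CDSW'}")
    single_quotes_ok = True
except ValueError as error:
    single_quotes_ok = False
    print("not valid JSON:", error)
```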
 
 
; Using other APIs
 
* Every API is different, so read the documentation!
* If the documentation isn't helpful, search online.
* For popular APIs, there are Python modules that help you make requests and parse JSON.
 
Possible issues:
* rate limiting
* authentication
* text encoding issues
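Two of these issues can be sketched in Python 3 without touching a real API. The helper <code>fetch_politely</code> is hypothetical: it pauses between requests as a simple way to stay under a rate limit, and it takes the fetching function as an argument so it can be tried offline. The right pause length depends on each API's documentation. Authentication is left out because it differs from one API to the next.

```python
import time


def fetch_politely(urls, fetch, pause=1.0):
    """Call fetch(url) for each URL, sleeping between requests (hypothetical helper)."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(pause)  # wait before every request except the first
        results.append(fetch(url))
    return results


# Text encoding: data arrives from the web as bytes and must be decoded into text.
raw_bytes = "caf\u00e9".encode("utf-8")  # 5 bytes, because the accented e takes 2
decoded_text = raw_bytes.decode("utf-8")  # back to a 4-character string
print(len(raw_bytes), len(decoded_text))
```

In real use, <code>fetch</code> would be a function that sends a request, for example one built around <code>requests.get</code>.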