Community Data Science Workshops (Fall 2014)/Day 3 lecture: Difference between revisions
Revision as of 22:00, 15 March 2015
imported>Mako
(3 intermediate revisions by the same user not shown)
Line 1:
{{CDSW Moved}}
== Material for the lecture ==
Line 26 ⟶ 28:
* Four things in Python I have to teach you:
** while loops
*** infinite loops
*** loops with a greater-than or less-than condition
** break / continue
** string.join()
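
The loop and string items above can be sketched in a few lines. This is only an illustration with made-up numbers, not the code used in the lecture. It assumes Python 3, where <code>join()</code> is a method called on the separator string.

```python
# A while loop with a "less than" condition: runs until count reaches 3.
count = 0
while count < 3:
    count = count + 1

# "while True" would loop forever on its own, so break is used to stop it.
# continue skips the rest of the current pass and returns to the top.
words = []
n = 0
while True:
    n = n + 1
    if n > 6:
        break          # leave the loop entirely
    if n % 2 == 1:
        continue       # odd number: skip the append below
    words.append(str(n))

# join() is called on the separator and takes a list of strings.
joined = ", ".join(words)

print(count)
print(joined)
```

Note that <code>join()</code> only accepts strings, which is why each number is converted with <code>str()</code> before it is appended.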
Line 33 ⟶ 37:
* Load data into Python
** review of opening files
*** we can also open them for reading
** csv module and csv.reader() function
** csv.DictReader()
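
As a rough sketch of the two readers listed above, the example below parses a small made-up table. The column names (<code>title</code>, <code>user</code>, <code>minor</code>) and the data are assumptions for illustration, and <code>io.StringIO</code> stands in for a file opened for reading so that the example runs without a file on disk.

```python
import csv
import io

# Made-up sample data standing in for a real file. With a file on disk,
# a file object opened for reading would be passed to the readers instead.
sample = "title,user,minor\nHarry Potter,Alice,1\nHogwarts,Bob,0\n"

# csv.reader yields each row as a list of strings, header row included.
rows = list(csv.reader(io.StringIO(sample)))

# csv.DictReader uses the header row as keys and yields one dict per data row.
records = list(csv.DictReader(io.StringIO(sample)))

print(rows[0])
print(records[0]["user"])
```

With <code>csv.reader()</code> the header has to be skipped or handled by hand; <code>csv.DictReader()</code> does this automatically, and each value can be looked up by column name.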
Line 38 ⟶ 43:
** Answer question: ''What proportion of edits to Wikipedia Harry Potter articles are minor?''
*** Count the number of minor edits and calculate proportion
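
One way to answer the minor-edits question is to count the flagged edits and divide by the total. The sketch below uses made-up edits, and the <code>minor</code> field name is an assumption about how the data is stored, not something the outline specifies.

```python
def proportion_minor(edits):
    """Return the share of edits flagged as minor, or 0.0 if there are none."""
    if not edits:
        return 0.0
    minor_count = 0
    for edit in edits:
        if edit["minor"]:
            minor_count = minor_count + 1
    return minor_count / len(edits)


# Made-up edits for illustration only.
sample_edits = [
    {"title": "Harry Potter", "minor": True},
    {"title": "Harry Potter", "minor": False},
    {"title": "Hogwarts", "minor": False},
    {"title": "Hogwarts", "minor": False},
]

print(proportion_minor(sample_edits))
```

The empty-list check avoids a division by zero when no edits are loaded.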
We mostly worked on these questions in the afternoon:
Line 48 ⟶ 57:
** Answer question: ''Who are the most active editors on articles in Harry Potter?''
*** Count the number of edits per user
* Looking at time series data
** "Bin" data by day to generate the trend line
* Exporting and visualizing data
** Export dataset on edits over time
** Export dataset on articles over users
** Load data into Google Docs
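
Counting edits per user and binning edits by day are the same operation with a different key, so both can be sketched with one helper. The example below uses made-up edits. The field names <code>user</code> and <code>timestamp</code>, the ISO-style timestamp format, and the helper names are all assumptions for illustration. The export writes to an in-memory buffer; a file opened for writing would work the same way and could then be uploaded to a spreadsheet.

```python
import csv
import io

# Made-up edits; "user" and "timestamp" are assumed field names.
sample_edits = [
    {"user": "Alice", "timestamp": "2014-11-01T09:15:00Z"},
    {"user": "Bob", "timestamp": "2014-11-01T17:40:00Z"},
    {"user": "Alice", "timestamp": "2014-11-02T08:05:00Z"},
]


def count_by(edits, key_function):
    """Count edits grouped by whatever key_function returns for each edit."""
    counts = {}
    for edit in edits:
        key = key_function(edit)
        counts[key] = counts.get(key, 0) + 1
    return counts


def get_user(edit):
    return edit["user"]


def get_day(edit):
    # The first ten characters of an ISO-style timestamp are the date,
    # so slicing "bins" every edit into its day.
    return edit["timestamp"][:10]


def write_counts(counts, header, output):
    """Write a two-column table of keys and counts, sorted by key."""
    writer = csv.writer(output)
    writer.writerow(header)
    for key in sorted(counts):
        writer.writerow([key, counts[key]])


edits_per_user = count_by(sample_edits, get_user)
edits_per_day = count_by(sample_edits, get_day)

# Export to an in-memory buffer; a file opened for writing works the same way.
buffer = io.StringIO()
write_counts(edits_per_day, ["day", "edits"], buffer)
exported = buffer.getvalue()
print(exported)
```

Sorting by key keeps the days in chronological order, because ISO-style dates sort correctly as plain strings.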