Community Data Science Workshops (Spring 2014)/Reflections: Difference between revisions

imported>Mako
imported>Mako
Line 116: Line 116:
== Session 2: Learning APIs ==
== Session 2: Learning APIs ==


Mentors and students felt that this session was the most successful and effective session — including, surprisingly, the most widely tested BPW session.
Mentors and students felt that this session was the most successful and effective session.


=== Morning Lecture ===
=== Morning lecture ===


The morning lecture was well received, if delivered too quickly by Benjamin Mako Hill. Unsurprisingly, the example of PlaceKitten as an API was an enormous hit.
The morning lecture was well received — if delivered too quickly. Unsurprisingly, the example of [http://placekitten.com/ PlaceKitten] as an API was an enormous hit: informative ''and'' cute.


Generally speaking, explaining what APIs are is difficult. In particular, it's useful to explicitly say that we are focused on web APIs and that APIs are protocols or languages. Learners frequently wanted to ask questions like, "Where in the program is the API?" The API, of course, is the protocol that describes what a client can ask for and what they can expect to receive back. Preparing a concise answer to this question ahead of time is worthwhile.
Defining APIs was difficult. The general ambiguity around the term, and the difference between APIs in general and web APIs, should be foregrounded. Learners frequently wanted to ask questions like, "Where in this Python program is the API?" It was difficult for some to grasp that the API is the ''protocol'' that describes what a client can ask for and what they can expect to receive back. Preparing a concise answer to this question ahead of time would have been worthwhile. We spent too much time on this in the session.
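One possible concise answer to "Where is the API?" is a sketch like the following. It assumes PlaceKitten's width/height URL convention, and the helper name <code>kitten_url</code> is made up for illustration. The point is that the API is the agreed shape of the request and the response, not a particular line of the Python program.

```python
def kitten_url(width, height):
    """Build the request a client sends to the service.

    Assumption: the service follows a /<width>/<height> URL convention.
    The URL pattern is the "protocol"; this function only follows it.
    """
    return "http://placekitten.com/{}/{}".format(width, height)


# The client asks for a 200x300 image; the service is expected to
# answer with an image of that size.
print(kitten_url(200, 300))
```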


Although there was some debate among the mentors, if there is one thing we might remove from the curriculum for a future session, it might be JSON. The reason it seemed less useful is that most of the APIs that most learners plan to use (e.g., Twitter) already have Python interfaces in the form of modules. In this sense, spending 1/4 of a lecture on learning how to parse JSON objects seems like a poor use of time. On the other hand, spending time looking at JSON objects provides practice thinking about more complex data structures (e.g., nested lists and dictionaries), which is something that ''is'' necessary and that students will otherwise not be prepared for.
Although there was some debate among the mentors, if there is one thing we might remove from the curriculum for a future session, it would probably be JSON. The reason it seemed less useful is that the APIs that most learners plan to use (e.g., Twitter and Wikipedia) already have Python interfaces in the form of modules. In this sense, spending 30 minutes of a lecture on learning how to parse JSON objects seems like a poor use of time.


On the other hand, time spent looking at JSON objects provides practice thinking about more complex data structures (e.g., nested lists and dictionaries), which is something that is necessary and that students will otherwise not be prepared for. We were undecided as a group.
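As a rough illustration of the kind of practice meant here, the sketch below parses a small JSON string and walks down through nested dictionaries and lists. The response text is invented for the example and is not the output of any real API.

```python
import json

# Hypothetical response text, invented for illustration only.
raw = """
{"query": {"pages": [
    {"title": "Seattle", "categories": ["Cities in Washington", "Port cities"]},
    {"title": "Python", "categories": ["Programming languages"]}
]}}
"""

data = json.loads(raw)            # top level: a dictionary
pages = data["query"]["pages"]    # a list of dictionaries
first_categories = pages[0]["categories"]  # a list of strings

# Collect one field from every dictionary in the list.
titles = [page["title"] for page in pages]

print(titles)
print(first_categories[0])
```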
=== Afternoon Sessions ===


=== Afternoon sessions ===
In our session, more than 2/3 of students were interested in learning Twitter and the session was heavily attended.


In our session, more than 60% of students were interested in learning Twitter and that track was heavily attended.
In Twitter, discoverability on the Tweepy objects was a challenge. Users will have an object but it's not easy to introspect those objects and see what's there in the same way you can with a JSON object. This came as a surprise to us and required some real-time consultation with the Tweepy documentation.


In Twitter, discoverability of the structure of [http://www.tweepy.org/ Tweepy] objects was a challenge. Users would create an object but it was not easy to introspect those objects and see what was there in the way we had discussed with JSON objects. This came as a surprise to us and required some real-time consultation with the [http://tweepy.readthedocs.org/en/v2.3.0/ Tweepy module documentation].
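One thing that might help in a future session is showing Python's built-in introspection tools up front. The sketch below uses a made-up <code>Status</code> class as a stand-in. It is not a Tweepy class, and real module objects may expose their data differently, so the module documentation is still the authority.

```python
class Status:
    """Hypothetical stand-in for an object returned by an API module."""

    def __init__(self, text, author):
        self.text = text
        self.author = author


status = Status("hello world", "example_user")

# vars() returns the instance's attribute dictionary, which can be
# read much like a parsed JSON object.
attributes = vars(status)
print(sorted(attributes))

# dir() lists every name on the object, including methods; filtering
# out names that start with an underscore leaves the public ones.
public_names = [name for name in dir(status) if not name.startswith("_")]
print(public_names)
```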
The Wikipedia session ended up spending very little time working with the example code we had prepared at all. Instead, we worked directly from examples in the morning and wrote code almost from scratch while looking directly at the API.


The Wikipedia session ended up spending very little time working with the example code we had prepared. Instead, we worked directly from examples in the morning and wrote code almost entirely from scratch while looking directly at the output from the API.
Our session focused on building a version of the game Catfishing. Essentially, we set out to write a program that would get a list of categories for a set of articles, randomly select an article, and then show categories back to the user to have them "guess" the article. We modified the program to not include obvious giveaways (e.g., to remove categories that include the answer itself as a substring).


Our session focused on building a version of the [http://kevan.org/catfishing.php game Catfishing]. Essentially, we set out to write a program that would get a list of categories for a set of articles, randomly select one of those articles, and then show categories associated with that article back to the user to have them "guess" the article. We modified the program to not include obvious giveaways (e.g., to remove categories that include the answer itself as a substring).
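The core of that program can be sketched roughly as follows. This is not the code written in the session: the category data is hard-coded and invented here instead of being fetched from the Wikipedia API, the function names are hypothetical, and the giveaway check is assumed to be a case-insensitive substring match.

```python
import random

# Hypothetical, hard-coded data standing in for categories that the
# session fetched from the Wikipedia API.
ARTICLE_CATEGORIES = {
    "Seattle": [
        "Cities in Washington (state)",
        "Seattle metropolitan area",
        "Port cities in the United States",
    ],
    "Giant panda": [
        "Bears",
        "Giant pandas",
        "Mammals of China",
    ],
}


def remove_giveaways(title, categories):
    """Drop categories that contain the answer itself as a substring."""
    answer = title.lower()
    return [c for c in categories if answer not in c.lower()]


def pick_round(article_categories, rng=random):
    """Randomly select one article and return it with its filtered clues."""
    title = rng.choice(sorted(article_categories))
    return title, remove_giveaways(title, article_categories[title])


title, clues = pick_round(ARTICLE_CATEGORIES)
print("Guess the article from these categories:")
for clue in clues:
    print(" -", clue)
```

In a full game the program would then read the user's guess and compare it with <code>title</code>; that step is left out to keep the sketch non-interactive.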
Both sessions worked well and received good feedback.


Both sessions worked well and received positive feedback.
In a future session, we might like to focus on other APIs including, perhaps, APIs that do not include modules, which would provide a stronger non-pedagogical reason to focus on reading and learning JSON.


Simple APIs might have been a good example of something we could do as a small group exercise between parts of the lecture.
In a future session, we might like to focus on other APIs including, perhaps, APIs that do not include modules. This would provide a stronger non-pedagogical reason to focus on reading and learning JSON. Working with simple APIs might have been a good example of something we could do as a small group exercise between parts of the lecture.


== Session 3: Data Analysis and Visualization ==
== Session 3: Data Analysis and Visualization ==