Community Data Science Workshops (Fall 2014)/Reflections: Difference between revisions
Revision as of 22:13, 15 March 2015
imported>Mako |
imported>Jtmorgan m (moved to wiki.communitydata.cc) |
(10 intermediate revisions by 2 users not shown) | |||
Line 1:
{{CDSW Moved}}
:''If you're interested in putting on your own CDSW, you should also see our [[Community Data Science Workshops (Spring 2014)/Reflections|reflections from Spring 2014]].''
Line 41 ⟶ 42:
We had 30 mentors who attended at least one of the sessions and at least 20 mentors at each session. Many of our mentors were UW students in more technical departments like [https://www.cs.washington.edu/ Computer Science and Engineering] and [https://www.hcde.washington.edu Human Centered Design & Engineering]. Perhaps half of them worked outside of the university as software developers.
We had about 150 participants apply to attend the sessions. We selected on programming skill (to ensure that all attendees were complete beginners) and enthusiasm, and then randomly, to maintain a learner-to-mentor ratio of between 4 and 5. We admitted 80 participants. 58 listed a UW
{| class=wikitable
Line 62 ⟶ 63:
We had two people each who listed their affiliations as Bio- and Health Informatics, the Foster School of Management, Microsoft, and Wikipedia.
We also had people from
Retention between sessions 0 and 1 was nearly 100%. Retention between sessions 1 and 2 and between sessions 2 and 3 was roughly 75% each, leaving us with perhaps 55-60% retention between session 0 and session 3.
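As a rough check on the figures above, step-to-step retention compounds multiplicatively. The sketch below treats the approximate rates from the text (nearly 100%, then roughly 75% twice) as exact values, which is an assumption made only for illustration; the variable and function names are hypothetical.

```python
# Approximate step-to-step retention rates from the text above.
# Treating them as exact values is an assumption for illustration.
step_retention = [1.00, 0.75, 0.75]  # sessions 0->1, 1->2, 2->3


def cumulative_retention(rates):
    """Multiply step-to-step retention rates to get overall retention."""
    overall = 1.0
    for rate in rates:
        overall *= rate
    return overall


overall = cumulative_retention(step_retention)
print(f"Overall retention from session 0 to session 3: {overall:.0%}")
```

With these inputs the product is 0.5625, which sits inside the 55-60% range reported above.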
Line 80 ⟶ 81:
[[User:Mako|Benjamin Mako Hill]] gave lectures in Session 1 and 3. Frances Hocutt gave the lecture in Session 2 and we felt that this was an important step. An important future goal is getting other people to give lectures. Tommy is an obvious choice to do one next time. Different faces, perspectives, and backgrounds are useful to communicate the breadth of interest here. [[User:Mako|Mako]] does not want to be the only one giving these lectures.
Our biggest challenge with growing the workshops was with physical space for the lectures. Basically, rooms
We reserved a lecture hall that fit 200 people and filled it with 100 students in alternating rows to make it at least possible to reach each person. This worked reasonably well
People continue to want a record of lectures. At the very minimum, we should make sure that we turn on console logging so that we can post this after the lectures. We intended to record lectures but, once again, this got lost in all the crazy preparation for the events.
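One possible way to get a console record, sketched here under the assumption that the lecturer types into a plain Python prompt, is to wrap the standard library's <code>code.InteractiveConsole</code> so that every submitted line is appended to a file. The class name <code>LoggingConsole</code> and the log file name are hypothetical; this is not what was used at the workshops.

```python
import code
import os
import tempfile


class LoggingConsole(code.InteractiveConsole):
    """Interactive console that appends every submitted line to a log file."""

    def __init__(self, log_path, local_vars=None):
        super().__init__(locals=local_vars)
        self.log_path = log_path

    def push(self, line, *args, **kwargs):
        # Record the line before handing it to the normal interpreter logic.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(line + "\n")
        return super().push(line, *args, **kwargs)


# Hypothetical log location; a real lecture would pick a stable path.
log_path = os.path.join(tempfile.mkdtemp(), "lecture.log")
console = LoggingConsole(log_path)

# Feed two lines programmatically, as if they had been typed at the prompt.
console.push("total = 41 + 1")
console.push("print(total)")
```

In a live lecture one would call <code>console.interact()</code> instead of pushing lines by hand, and post the resulting log file afterwards.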
Line 99 ⟶ 100:
== Session 0: Python Setup ==
The goal of this session was to get users set up with Python and starting to learn some
* Anaconda is not free software
* Anaconda does not support Python 3 which we'd like to move to.
*
Additionally, we moved the Windows curriculum away from <code>cmd</code> to using Powershell. This was
Changes for next time include:
* Because it was less
* Because Powershell was successful, we're going to try to create a single consolidated set of installation instructions for Windows, Mac OSX, and GNU/Linux
* We will make it more clear to mentors whether participants should self-report they’d completed the steps or whether the mentor should verify that the steps were all taken (the latter). In future, we will email mentors ahead of time to let them know.
*
* We need to do a better job of
* The sticky notes we bought were small and ambiguously colored. We should get
* We are going to try writing additional installation instructions that do not rely on Anaconda so people have a fully open source option.
* Once again, not a single person outside of
* We
* Not everybody loves the checkout step. Maybe there's a way we can make it more fun?
We also had [[Community Data Science Workshops (Fall 2014)/Reflections#Mentorship|a bunch of general feedback on how we could improve mentorship]] that is particularly relevant to the earlier sessions.
== Session 1: Introduction to Python ==
The goal of this session was to teach the basics of programming in Python. The basic curriculum was originally built off the [[Boston Python Workshop]] curriculum, which has been used many times and is well tested. Unsurprisingly, it worked well for us too.
=== Afternoon sessions ===
We felt that the new [[Baby Names]] project was excellent and feedback was
Suggestions based on feedback include:
* Do a better job of
* Consider simply having two smaller rooms doing [[Baby Names]] and perhaps
* Prepare questions beforehand, list them all up front, and let folks choose what to work on.
Line 149 ⟶ 147:
=== Morning lecture ===
The [[Community Data Science Workshops (Fall 2014)/Day 2 lecture|morning lecture]] was given by Frances Hocutt and it
Frances used excellent slides which are shared [[Community Data Science Workshops (Fall 2014)/Day 2 lecture|on the wiki page]] and which we will reuse. About half found
Since many people felt the lecture was on the slower side, we want to use this time to introduce function
=== Afternoon sessions ===
There were three parallel afternoon sessions on '''Twitter''', '''Wikipedia API''' and '''SQL'''.
'''Twitter''':
* Once again, the session had too many people for the room and we should consider splitting it if we have mentors who are comfortable
*
* A bunch of people found the Twitter session too fast
* TweePy continues to be both poorly documented and opaque. The opacity was a problem, and we may want to create an interface to TweePy that just gives users raw JSON.
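A minimal sketch of what such a raw-JSON interface might look like follows. It assumes that TweePy result objects keep the original API payload in a <code>_json</code> attribute; the <code>FakeStatus</code> class is a hypothetical stand-in so the example runs without TweePy or Twitter credentials, and <code>raw_json</code> is a hypothetical helper name.

```python
class FakeStatus:
    """Hypothetical stand-in for a TweePy status object.

    Assumption: real TweePy model objects expose the original API
    payload as a dictionary in their ``_json`` attribute.
    """

    def __init__(self, payload):
        self._json = payload


def raw_json(items):
    """Return the plain-dictionary payload for each TweePy-style object."""
    return [item._json for item in items]


# Made-up example data, not real tweets.
statuses = [
    FakeStatus({"id": 1, "text": "hello"}),
    FakeStatus({"id": 2, "text": "world"}),
]

tweets = raw_json(statuses)
texts = [tweet["text"] for tweet in tweets]
print(texts)
```

The point of the design is that learners would only ever handle dictionaries and lists, the same structures they already practiced with in the Wikipedia API session.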
'''Wikipedia''' workshop:
* In terms of delivery, there was mixed feedback, including some excellent feedback and some who felt that it was too detailed and slow. This mirrored some of our feedback from last time. One approach would be to make the Wikipedia room a designated "slower" room.
*
'''SQL workshop''':
Jonathan ran a session on using SQL. Although this was a diversion from the strong Python focus, it was well attended and appreciated by students trying to build up this skill.
* Generally, the session was very successful. It seemed to work really well and did a good job of giving people an overview of data science and a way to hook themselves into it.
* Can we host an open SQL database somewhere?
* Next time, if we do this again, we should consider integrating Python more closely into the session. We may either close the loop in this session or perhaps split into two sessions: (1) introduction to SQL; and (2) using Python to pull the query results back into Python (e.g., into Pandas).
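To illustrate what "closing the loop" between SQL and Python could look like, here is a small sketch using only the standard library's <code>sqlite3</code> module. The table name, column names, and rows are made up for illustration and are not from the workshop materials.

```python
import sqlite3

# In-memory database with a hypothetical table; the data is invented.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be read like a dictionary

conn.execute("CREATE TABLE baby_names (name TEXT, births INTEGER)")
conn.executemany(
    "INSERT INTO baby_names (name, births) VALUES (?, ?)",
    [("Ada", 12), ("Grace", 30), ("Linus", 7)],
)

# Run a query in SQL, then bring the results back into Python.
rows = conn.execute(
    "SELECT name, births FROM baby_names WHERE births > ? ORDER BY births DESC",
    (10,),
).fetchall()

records = [dict(row) for row in rows]
print(records)
conn.close()
```

If Pandas is installed, <code>pandas.read_sql_query</code> can run the same query against the connection and return a DataFrame instead of a list of dictionaries.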
== Session 3: Data Analysis and Visualization ==
The goal of the lecture was to walk people through the actual mess of writing code.
In general, goals were clearer this time and the use of Anaconda meant that we could use <code>requests</code>, which cleaned up several problems from last time and led to clearer code.
One challenge, pointed out in a question at the end of the final lecture, is that we don't do very much actual data analysis during the lecture. Next time, we should make this much more clear up front. The reality is that we were doing analysis from the very first day and that where analysis starts and where data cleaning and munging ends can be fluid, fuzzy, and subjective. We should foreground this in the beginning of the lecture or even at the beginning of the workshops.
=== Afternoon sessions ===
We ran two sessions this time.
An '''
Also, next session, we are going to consider using [https://pypi.python.org/pypi/seaborn/0.1 SeaBorn] instead of matplotlib, which Tommy seemed excited about.
== General Feedback ==
* Generally, there was a sense that we should stop creating pages in the
* We should try to schedule the workshop not
* Mentors should post the code generated in the break-
* There was general interest in pair programming or more team based
* There was a need for several on-the-fly corrections of the instructions and files on the wiki during the workshop. Better planning and testing for this will be very useful.
=== Mentorship ===
Last time through, most of our observations were focused on improving the experience of attendees and we think we didn't spend as much time on helping mentors have a great experience and helping them prepare effectively. We had many new mentors this round. One general concern was the relative lack of mentor training, especially before the first sessions. We had a series of pieces of feedback on how to improve this.
* Arrange a pre-CDSW mentors meeting (perhaps a day or two before, to go over material), maybe at a bar or other social environment with beer and pizza. We could use this for norms, best practices, goals, planning, etc.
* Perhaps meet 15-20 minutes early before Session 0 to get to know each other and go over things.
* Create some easier way to distinguish mentors from students (e.g., t-shirts, buttons, paper them head to foot in sticky notes).
* Send out
**
** Explicitly encourage mentors to reach out to students and ask them how things are going by walking around to every single person to ask, “How are you doing? What are you working on? Show me what you’re doing.”
=== More Projects or Better Projects ===
Arguments for a smaller group of the best break-out sessions include:
* Focus on a known good thing.
*
Arguments against include:
Line 247 ⟶ 227:
* Diversity of projects inspires people to do the kinds of things that people can do with this new knowledge.
We should pursue other ways to encourage creativity with code. For
== Budget ==
We spent a total of $3280 on the CDSW. About $350 of this funded food and refreshments during post-session meetings among the mentors. About $280 was spent on coffee,
The rest (the large majority) was spent on food. Because we were better able to model retention this time around, we did a much better job of ordering the "right" amount of food. We ordered:
* Session 1: Pizza from Jet City Pizza
* Session 2: Indian (four entrees) from Jewel of India
* Session 3: Greek food (e.g., salad, hummus, spinach pies, souvlaki) from Costas
Because [[Mako]] did the ordering, everybody ate vegetarian. At least one person complained about the lack of meat in Session 2 (but seemed to be confused into thinking it was present in Session 1).
<!--
Line 258 ⟶ 249:
-->