Community Data Science Workshops (Fall 2014)/Reflections: Difference between revisions

no edit summary
imported>Mako
m (formatting fix)
imported>Mako
No edit summary
Line 1:
:''If you're interested in putting on your own CDSW, you should also see our [[Community Data Science Workshops (Spring 2014)/Reflections|reflections from Spring 2014]].''
 
Over three weekends in Fall 2014, a group of volunteers organized the [[Community Data Science Workshops (Fall 2014)]] (CDSW), the latest in [[CDSW|a series of workshops]] designed to introduce some of the basic tools of programming and analysis of data from online communities to absolute beginners. The [[CDSW (Fall 2014)|Fall 2014 events]] were held between November 7th and 22nd at the University of Washington in Seattle.
 
This page hosts reflections on organization and curriculum and is written for anybody interested in organizing their own CDSW — including the authors!
 
In general, the mentors and students suggested that the workshops were a huge success. Students reported that they learned an enormous amount and benefited enormously. Mentors were also generally very excited about running similar projects in the future. That said, we all felt there were many ways to improve on the sessions, which we have detailed below.
 
If you have any questions or issues, you can contact [[Benjamin Mako Hill]] directly or can email the whole group of mentors at cdsw-au2014-mentors@uw.edu.
Line 18:
* '''Session 3 (Saturday November 22nd)''': [[Community Data Science Workshops (Fall 2014)# Session 3|Data analysis and visualization]]
 
Our organization and the curriculum for Sessions 0 and 1 were originally borrowed from the [http://bostonpythonworkshop.com/ Boston Python Workshop] (BPW), although the curriculum has diverged quite a bit at this point as we've improved it and tailored it to the learning goals of our sessions.

Session 0 was a three-hour evening session to install software. The three other sessions were day-long sessions (10am to 4pm) broken up into the following schedule:
 
* '''Morning, 10am-12:20''': A 2 hour lecture
Line 31 ⟶ 33:
* [https://docs.google.com/forms/d/1RJTTwXe2O_C1ZAtMgWRLGXVc-tRpY76NbvorLg644MQ/viewform After Session 2]
* [https://docs.google.com/forms/d/1-BngUwkEmephM2xLl3Ews2LnopF3sI7hlgYhQK4YJL4/viewform After Session 3]
* [https://docs.google.com/forms/d/1v2gNpPSY3gjJ9G_PZgmjt2YZTBz6XxA6-lLUzDKWfMg/viewform After Session 3 (Unretained)] — Unsurprisingly, perhaps, not a single person filled this out so we likely will not bother with this in the future.
 
We used this feedback to both evaluate what worked well and what did not and to get a sense of what students wanted to learn in the next session and which afternoon sessions they might find interesting.
 
== Participants ==
 
We had 30 mentors who attended at least one of the sessions and at least 20 mentors at each session. Many of our mentors were UW students in more technical departments like Computer Science and Engineering and Human Centered Design and Engineering. Most were full-time programmers.
 
We had about 150 participants apply to attend the sessions. We selected on programming skill (to ensure that all attendees were complete beginners) and enthusiasm, and then randomly, to maintain a learner-to-mentor ratio of between 4 and 5. We admitted 80 participants.
 
Retention between sessions 0 and 1 was nearly 100%. Retention between sessions 1 and 2 and between sessions 2 and 3 was roughly 75% each, leaving us with perhaps 55-60% retention at the end of session 3.
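The overall retention figure follows from multiplying the session-to-session rates together. The following is a minimal sketch in Python, assuming the approximate rates quoted above; the function name <code>cumulative_retention</code> is ours and is purely illustrative.

```python
def cumulative_retention(step_rates):
    """Turn per-step retention rates into cumulative retention after each step."""
    remaining = 1.0
    cumulative = []
    for rate in step_rates:
        remaining *= rate  # each step keeps only a fraction of those still attending
        cumulative.append(remaining)
    return cumulative


# Approximate rates from the text (assumptions, not exact counts):
# ~100% from session 0 to 1, ~75% from 1 to 2, ~75% from 2 to 3.
rates = [1.00, 0.75, 0.75]

for session, share in enumerate(cumulative_retention(rates), start=1):
    print(f"After session {session}: about {share:.0%} of starters remain")
```

With these rough inputs the final value is 0.5625, or about 56%, which is consistent with the 55-60% estimate.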
 
Anecdotally, there is a sense that those who are dropping are those who had more trouble but didn’t struggle visibly.
 
Although our participant pool in [[CDSW (Spring 2014)]] was overwhelmingly female, this time there was close to gender balance among both students and mentors. Roughly 2/3 of mentees were from UW, and the group included people from a wide range of places, including someone who works for the city of Seattle. Many Wikipedians were there. We continue to think that it's cool that people who are not doing research but are part of online communities were in the mix with the researchers.
We had 16 students from HCDE, as well as a bunch of mentors from the department. They were good mentors.
 
Once again, quite a large number of people applied who were already skilled programmers. We're still not exactly sure why these people are applying because we think it is very clear that the workshops are for absolute beginners. Perhaps people just want more exposure to data science?
 
Once again, the constraint on scaling the workshop is the number of mentors. Every mentor means that the workshop can accommodate four more mentees.
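The relationship between mentors and workshop size is simple arithmetic. A minimal sketch in Python, assuming the four-mentees-per-mentor figure above; the function names are hypothetical and only for illustration.

```python
def mentee_capacity(num_mentors, mentees_per_mentor=4):
    """How many mentees a given number of mentors can support."""
    return num_mentors * mentees_per_mentor


def mentors_needed(num_mentees, mentees_per_mentor=4):
    """How many mentors are needed for a given number of mentees (rounded up)."""
    return -(-num_mentees // mentees_per_mentor)  # ceiling division


print(mentee_capacity(20))   # capacity with 20 mentors at a 4:1 ratio
print(mentors_needed(100))   # mentors required to support 100 mentees
```

At a ratio of 4, 20 mentors support 80 mentees, and 100 mentees would call for 25 mentors.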
Line 55 ⟶ 57:
== Morning Lectures ==
 
[[User:Mako|Mako]] gave lectures in Sessions 1 and 3. Frances Hocutt gave lecture 2 and, generally, this was seen as an important success. An important future goal is getting other people to give more of the lectures. Tommy is an obvious choice to do one next time. Different faces, perspectives, and backgrounds are useful to communicate the breadth of interest here. [[User:Mako|Mako]] does not want to be the only one giving these lectures.
 
Our biggest challenge with growing the workshops was physical space for the lectures. Basically, rooms that can hold more than 100 people at UW are almost exclusively lecture halls, which make it almost impossible for mentors to reach students.
 
We reserved a lecture hall that fit 200 people and filled it with 100 students in alternating rows to make it at least possible to reach each person. Projects are done in breakout sessions which can be split.
 
People continue to want a record of lectures. At the very minimum, we should make sure that we turn on console logging so that we can post this after the lectures.
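One possible way to keep such a record is sketched below, using only Python's standard-library <code>code</code> module. This is an assumption on our part, not a description of how the lectures were actually run: the class name <code>LoggingConsole</code> and the log file name are hypothetical, and the sketch records only the lines typed into the console, not the output they produce.

```python
import code


class LoggingConsole(code.InteractiveConsole):
    """An interactive console that appends every input line to a log file."""

    def __init__(self, log_path, local_vars=None):
        super().__init__(locals=local_vars)
        self.log_path = log_path

    def push(self, line, *args, **kwargs):
        # Record the line before running it, so the log survives a crash.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(line + "\n")
        return super().push(line, *args, **kwargs)


# To use it live during a lecture (hypothetical file name):
#     LoggingConsole("session1_lecture_log.py").interact()
```

The resulting file is plain text that could be posted to the wiki after the lecture.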
 
 
 
== Session 0: Python Setup ==
 
The goal of this session was to get users set up with Python and starting to learn some of the basics. We changed the curriculum enormously to use Continuum's Anaconda instead of Python directly from [http://python.org python.org]. The result was staggering. Not a ''single person'' reported "many problems with set-up" (i.e., respondents reported either "no problems" or a "few problems").
 
Anaconda was key to how smoothly setup went compared to the first workshop series and addressed most of our setup and path issues. That said, we had several major concerns:
 
* Anaconda is not free software or open source
* Anaconda does not support Python 3 which we'd like to move to
* One student had a home directory whose name was in Chinese, which caused the Anaconda installation to fail at a very late stage. This was eventually fixed by a mentor who changed the path.
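A problem like the home directory issue could in principle be spotted before installation starts. The following is a minimal sketch in Python, not something we used at the workshop: the helper name <code>non_ascii_parts</code> is hypothetical, and we are assuming that non-ASCII characters in the path were the trigger for the failure.

```python
from pathlib import Path


def non_ascii_parts(path):
    """Return the components of a path that contain non-ASCII characters."""
    return [
        part
        for part in Path(path).parts
        if any(ord(ch) > 127 for ch in part)
    ]


if __name__ == "__main__":
    home = Path.home()
    problems = non_ascii_parts(home)
    if problems:
        print(f"Home directory {home} has non-ASCII components: {problems}")
        print("Consider installing to a path that uses only ASCII characters.")
    else:
        print(f"Home directory {home} uses only ASCII characters.")
```

A mentor could run a check like this at the start of setup and pick a different install location if it reports a problem.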
 
Additionally, we moved the Windows curriculum away from <code>cmd</code> to PowerShell. This was a huge benefit because it meant that <code>ls</code> works and the rest of the curriculum can converge. The only concern was:

* PowerShell is not installed on Windows XP, although ''not a single student had Windows XP''
 
Changes for next time include:
 
* Because it was less successful, we can deemphasize recruiting mentors to the Friday night session.
* Because PowerShell was successful, we're going to try to create a single consolidated set of installation instructions for Windows, Mac OS X, and Linux!
* We will make it clear to mentors whether participants should self-report that they’ve completed the steps or whether the mentor should verify that the steps were all taken. In future, email mentors ahead of time to let them know.
* We need to do a better job of modeling the use of sticky notes during lectures early on.
* The sticky notes we bought were small and of an ambiguous color. We should get bright red sticky notes next time.
* Set up/arrange/select the space to facilitate better circulation of mentors. When mentors can circulate easily, things are better for mentees.
* We are going to try writing installation instructions that do not rely on Anaconda so people have a fully open source option.
* Once again, not a single person outside of mentors ran GNU/Linux. We should strongly consider how much effort we want to put into maintaining this part of the curriculum.
* We should move to Python 3 to try to address lingering Unicode issues. We should try to do this for the next session.
* Not everybody loves the checkout step. Maybe there's a way we can make it more fun?
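As background for the Python 3 item in the list above, here is a minimal sketch of why Python 3 tends to surface Unicode problems earlier: text (<code>str</code>) and encoded data (<code>bytes</code>) are separate types that cannot be mixed silently. The example string is our own and not taken from the curriculum.

```python
text = "caf\u00e9"            # a str made of four Unicode code points
data = text.encode("utf-8")   # bytes: the accented letter takes two bytes in UTF-8

print(len(text), len(data))   # lengths differ: characters versus bytes

# Decoding with the same encoding recovers the original text.
assert data.decode("utf-8") == text

# Python 3 refuses to combine text and bytes implicitly.
try:
    text + data
except TypeError as error:
    print("Cannot mix str and bytes:", error)
```

In Python 2 the equivalent mix often succeeds until a non-ASCII character appears, which is the kind of surprise beginners hit when working with data from online communities.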
 
 
 
We also had [[Community Data Science Workshops (Fall 2014)/Reflections#Mentorship|a bunch of general feedback on how we could improve mentorship]] that is particularly relevant to the earlier sessions.
 
== Afternoon Sessions ==
Line 136 ⟶ 104:
talk to chris to try to fix those things
 
 
 
== Session 1: Introduction to Python ==