Python Workshops for Beginners/Reflections

Over three weekends in Fall 2014, a group of volunteers organized the Python Workshops for Beginners (PWFB), a series of four sessions designed to introduce absolute beginners to some of the basic tools of programming and to the analysis of data from online communities. The PWFB were held between September 26 and November 15, 2014, at the University of Waterloo in Waterloo, ON, Canada.

This page hosts reflections on organization and curriculum and is written for anybody interested in organizing their own PWFB — including the authors!

In general, the mentors and students felt that the workshops were a huge success. Students reported that they learned an enormous amount and benefited greatly. Mentors were also generally very excited about running similar projects in the future. That said, we all felt there were many ways to improve on the sessions, which are detailed below.

If you have any questions or issues, you can contact Elana Hashman.

Structure

The PWFB consisted of four sessions:

  • Session 0: Python Setup
  • Session 1: Introduction to Python
  • Session 2: Learning APIs
  • Session 3: Data Analysis and Visualization

Session 0, the evening session, ran from 6 to 9 PM and involved self-guided completion of setup and introductory exercises. The rest of the sessions followed this approximate structure:

  • Morning, 10:30 AM - 12:00 PM: Lecture with one or more breaks.
  • Lunch, 12:00 PM - 1:00 PM: Lunch is served.
  • Afternoon, 1:00 PM - 1:15 PM: Afternoon sessions are introduced.
  • Afternoon, 1:15 PM - 3:30 PM: Afternoon sessions with short projects.
  • Wrap-up, 3:30 PM - 4:00 PM: Closing remarks, next steps, and homework.

Session 2 also featured a review session prior to the morning lecture.

We collected detailed feedback from users at five points using the following Google forms (these are copies):

We used this feedback to evaluate both what worked well and what did not. The final follow-up survey was intended to evaluate how effective the workshops were; we received 32 responses out of our group of 50 participants. We learned that over 60% of participants said they felt less intimidated by programming after completing the workshops, 75% found the workshops "Enjoyable" or "Very Enjoyable", 85% said their interest levels in programming increased, and 90% rated the workshops overall positively. Only four respondents (13%) did not use their new programming skills in any capacity after the workshops. 77% said they would be interested in attending intermediate workshops in the future.

We had ~10 mentors volunteer per session, with a total group of 22 volunteers. Mentors received very positive comments and feedback, with almost half of survey respondents labeling them "Excellent"; only one respondent rated them "Fair", with the rest of responses falling under "Good" (18%) and "Very Good" (29%).

Sessions 0 and 1 had full attendance, but we lost about half our students for Session 2, which was held four weeks later (it was initially planned for a week earlier, but there was a room booking conflict). We attribute this drop in retention to poor timing (the heart of midterm season) and to the long gap between the sessions. Session 3 retained the same number of students that attended Session 2.

Demographics

We had about 230 participants apply to attend the sessions. About 100 of those were immediately filtered out for eligibility: no math or engineering undergrads were permitted to attend the workshops, as their programs have significant required programming components (often 2-3 classes covering far more depth than our curriculum). After program filtering was completed, the initial round of application review was completely anonymized. We selected based on programming skill (to ensure that all attendees were complete beginners), enthusiasm, and overall application quality, and I capped the total at around 50 participants given our budget. To break ties for the last few remaining positions, gender was considered. I ended up accepting a total of 76 applicants, with the assumption that a number of them would decline or not show up, which was the case. We had exactly 50 students attend Sessions 0/1.

We promised the PSF we'd collect diversity information in aggregate on our accepted attendees. We found that:

  • 52/76 (68%) were women, and the other 24/76 (32%) were men. We used a free-entry text form to collect this information.
  • 38/66 (58%) applicants identified as a visible minority, and 28/66 (42%) did not. (10 applicants declined to respond.)
  • 8/68 (12%) applicants had a disability, and 60/68 (88%) did not. (8 applicants declined to respond.)

Morning Lectures

The CDSW in Seattle began each full day with a 120-minute lecture with no breaks. This was a little too intense for the students, so I decided to reduce the length to 1.5 hours and break things up with short, self-directed exercises. These went very well; students loved how the exercises reinforced the lecture content. I'm not as experienced a lecturer as Mako, so rather than lecturing freeform, I also chose to use slides and distribute them to students, who told me this made it easier to follow along.

In the Session 3 survey, 35% of respondents said the lectures were "Good", 35% called them "Very Good" and 18% called them "Excellent". 94% of students rated the instructor positively (12% "Good", 47% "Very Good", 35% "Excellent") and the curriculum positively (35% "Good", 41% "Very Good", 18% "Excellent").

Projects

In the afternoons, we broke into small groups to work on projects. Each afternoon we tried to offer three project tracks: two projects on different substantive topics for learners with different interests, and a third track of self-directed study for those not interested in the first two.

In Sessions 1 and 2, the self-directed projects were based on working through examples from Code Academy, drawing on material already online on their website. In the self-directed track, students could work at their own pace with mentors on hand to help when they became stuck.

In Session 3, one of our session leads did not show up, and I was going to lead the other; at the request of students, I held a single afternoon session that involved working through various data science examples together as a class, and answered general questions about Python programming. It ended up being more of an extension of the morning lecture and a discussion of next steps than the projects we had imagined.

In the other tracks, students would download a prepared example in the form of a zip or tar.gz file. In each case, these projects would include:

  • All of the libraries necessary to run the examples (e.g., Tweepy for the Session 2 Twitter track).
  • All of the data necessary to run the example programs (e.g., a full English word list for the Wordplay example).
  • Any other necessary code or libraries we had written for the example.
  • A series of small numbered example programs. Each example program attempted to be sparse, well documented, and not more than 10-15 lines of Python code. Each program tried both to do something concrete and to provide an example for learners to modify. Although it was not always possible, the example programs tried to use only Python concepts we had covered in class. (A sketch of one such program follows this list.)
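
For illustration, here is a sketch of what one of these numbered example programs might have looked like, in the Wordplay style. The filename and the `get_words()` helper are hypothetical stand-ins for the code we shipped in the actual zip files:

```python
# 3_palindromes.py -- hypothetical example in the style described above.
# get_words() stands in for helper code shipped with the exercise; assume
# it returns the full English word list as a list of lowercase strings,
# so learners don't need to handle files themselves.
from words import get_words

for word in get_words():
    # A palindrome reads the same forwards and backwards.
    if len(word) > 2 and word == word[::-1]:
        print word   # Python 2.7 print statement
```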

On average, the non-self-directed afternoon tracks consisted of about 30% impromptu lecture, in which a designated lead mentor would walk through one or more of the examples, explaining the code and concepts in detail and answering questions.

Afterwards, the lead mentor would present a list of increasingly difficult challenges for the entire group to work through sequentially. These were usually written on a whiteboard or projected, and were often added to dynamically based on student feedback and interest.

Learners would work on these challenges at their own pace, with mentors available for help. If the group was stuck on a concept or tool, the lead mentor would bring everyone back together to walk through the concept using the project as a full group.

In some cases, more advanced students could "jump ahead" and begin working on their own challenges or changing the code to work in different ways. This was welcome and encouraged.

In all cases, we gave students red sticky notes they could use to signal that they needed help (a tool borrowed from Software Carpentry, or SWC).

Afternoon sessions received less positive feedback than the morning lectures: 35% of survey respondents rated them "Fair", 41% "Good", and the remaining 24% "Very Good". This was definitely our largest area for improvement; the biggest takeaway was ensuring lead mentors were more prepared.

Session 0: Python Setup

The goal of this session was to get users set up with Python and starting to learn some of the basics. We ran into the following challenges:

  • We chose to install Anaconda to make obtaining libraries easier for OS X and Windows users. This was a disaster over the wireless network, as 50 people tried to download the same 300MB file. We solved this by passing around USB sticks, which would have been better to prepare in advance.
  • There was confusion as to whether Python 2.7 or Python 3 should have been installed. We wanted to use Python 2.7 but many attendees accidentally installed Python 3.
  • The wireless problems delayed the session quite a bit, so not everyone finished the exercises.
  • Students were getting confused between the OS shell and the Python shell.
  • By focusing on the Python shell, students weren't saving their work to files and were confused by the idea of running `python file.py` in the OS shell (see the sketch after this list).
  • There were a few problems with the Code Academy exercises.
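
For reference, the distinction that kept tripping students up: a program saved to a file is run from the OS shell, while the `>>>` Python shell only understands Python code. A minimal sketch (the filename is just for illustration):

```python
# hello.py -- save this as a file, then run it from the OS shell
# (Terminal on OS X/Linux, cmd.exe on Windows) with:
#
#     python hello.py
#
# Typing that command at the ">>>" Python prompt will not work, because
# the Python shell only accepts Python code, not OS shell commands.
print "Hello from a file!"   # Python 2.7 print statement
```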

These things worked really well:

  • Sticky notes
  • Number of mentors: we had a lot, but not too many
  • Practising with the terminal, testing the Saturday projects and doing the follow-along tutorial had very few issues
  • Attendees really liked the "work at your own pace" set-up of the session

We took this feedback and incorporated it directly into a number of changes for our prepared Session 1, with much success.

Session 1: Introduction to Python

The goal of this session was to teach the basics of programming in Python. The curriculum for BPW has been used many times and is well tested. Unsurprisingly, it worked well for us as well.

We received 29 total survey responses for Sessions 0/1.

Morning lecture

About a third of students said the morning lecture went too fast. Half said it was just the right pace, and the remaining students said it was too slow. I adjusted my lecturing pace according to this feedback in future sessions, slowing things down a tad.

That said, there are several things we will change when we teach the material again:

  • We might want to add a dictionary exercise during a lecture break. In general, mentors wanted to see more content in Session 1 focused on working with dictionaries.
  • It was suggested that we add candy/treat incentives to answering questions in lecture. We didn't end up doing this, but it's a good idea.

Afternoon sessions

Student opinion on the Session 1 afternoon sessions was very split.

The volunteer who had offered to put together the Shakespeare exercise was not prepared (they were finishing the lecture notes over lunch), and so the pace of their session was much too fast for the students. Furthermore, they introduced content I had explicitly asked them not to use yet (working with files). Yet the majority of students opted to attend the Shakespeare session, despite a warning that the curriculum was new and "experimental". In hindsight, we should not have held it in the main room, as many students stuck around as a result of inertia. We had a lot of students express confusion and frustration at the pace of this session in the survey feedback, and we worried that some may not have returned to Sessions 2 and 3 as a result. Many of them switched to self-directed exercises on Code Academy as they lost track of the session. This was our lowest point in the workshop series.

The Wordplay session was rated very highly, and the pace was "just right", but many fewer students attended it. We attributed this to a less attractive "pitch" from the leader, which may have discouraged attendance. The mentor leading this session received a lot of positive feedback.

Our biggest takeaway was to ensure the afternoon session leaders were more prepared in advance, both in terms of curriculum and their session pitches. More coordination between myself (lecturer), session leaders, and mentors was necessary.

After this session, we learned to work better as a group and ironed out many of the kinks.

Session 2: Learning APIs

The goal of this session was to describe what web APIs were, how they worked (making HTTP requests and receiving data back), and how to use common web APIs from Wikipedia and Twitter. We received the fewest survey responses after this session: only 10 total. While we had fewer attendees, some mentors remarked that the sessions felt a lot more intimate and this seemed to encourage students to participate more as a group. Certainly, I received a lot more questions in the morning.

The gap between sessions was remarked upon as problematic by many attendees, and so a lot of them mentioned their appreciation for the morning review session I added.

One student remarked that this session was much better organized than Session 0/1, and was really impressed with the improvement (good -> great!).

Mentors had a discussion about using an IDE over the shell in the future. We reached a group consensus (with some notable dissenters) that it was important to focus on the shell and command line because we were trying to teach students cross-platform skills that would be helpful in the future, as opposed to a specific tool for specific tasks.

Morning lecture

On the advice of the CDSW reflections, we removed the section on JSON and only covered it very briefly.

Students loved PlaceKitten, although we ran into a bit of an issue on Windows with writing the image files (we needed to specify binary write mode). This wasn't necessarily a bad thing, as the "demon kittens" of the garbled JPEG images, and the solution we found later, became a bit of an inside joke.
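
For reference, a minimal sketch of the fix, using Python 2.7's standard-library urllib2 (the actual session code may have differed):

```python
# Fetch a kitten image from PlaceKitten and save it to disk.
# Opening the output file in binary mode ("wb") is what prevents the
# garbled "demon kitten" JPEGs on Windows, where the default text mode
# mangles the image bytes.
import urllib2

response = urllib2.urlopen("http://placekitten.com/400/300")
image_data = response.read()

output_file = open("kitten.jpg", "wb")   # "wb", not "w"
output_file.write(image_data)
output_file.close()
```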

Timing became a bit of a problem. Some students arrived 30 minutes late to the review session, and many others arrived 30 minutes late to the morning lecture. We decided to just adapt our schedule instead of rushing people or cutting them off. While some attendees had to leave, the consensus was that this worked well for the group as a whole.

Afternoon sessions

We had two sessions, focusing on the Wikipedia and Twitter APIs. This time, no survey response claimed the afternoon session was too fast.

The Twitter session went very well. After going through a few examples, we introduced attendees to the documentation and let them work independently on their own ideas. The enthusiasm was tangible and the attendees got really creative! Unfortunately for us, Twitter had recently made a large API change, causing Tweepy to not work as documented in about half of the cases we tried, which was very frustrating for attendees. It was suggested that we prepare a few more canned examples for those individuals who can't come up with their own projects.
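
One possible canned example along those lines, sketched with the Tweepy API of the era (the credentials are placeholders for keys attendees generated themselves, and given the API changes above, not every such call was guaranteed to work):

```python
# Print the text of a user's ten most recent tweets.
import tweepy

# Placeholder credentials; attendees created their own Twitter app keys.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth)

for tweet in api.user_timeline(screen_name="openhatch", count=10):
    print tweet.text   # Python 2.7 print statement
```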

While not a problem, the Twitter session leader and I want to note the importance of cross-platform testing. We found a bunch of problems by testing on her Windows laptop before we took the exercises to the group, and this was really helpful.

The Wikipedia session was smaller, but very active. The entire group participated and worked together to build a "Catfishing" game with their session leader.
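
To give a flavour of the raw ingredients (this is not the session's actual code, and the game logic the group built is omitted), here is a rough sketch against the standard MediaWiki web API: fetch a random article, show its categories, and have players guess the article.

```python
# Building blocks for a "Catfishing"-style guessing game: pick a random
# Wikipedia article and list its categories.
import json
import urllib
import urllib2

API_URL = "https://en.wikipedia.org/w/api.php"

# Pick one random article title from the main namespace.
random_data = json.load(urllib2.urlopen(
    API_URL + "?action=query&list=random&rnnamespace=0&rnlimit=1&format=json"))
title = random_data["query"]["random"][0]["title"]

# Look up that article's categories.
category_data = json.load(urllib2.urlopen(
    API_URL + "?action=query&prop=categories&format=json&titles=" +
    urllib.quote(title.encode("utf-8"))))

page = category_data["query"]["pages"].values()[0]
for category in page.get("categories", []):
    print category["title"]
print "Answer:", title
```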

Session 3: Data Analysis and Visualization

The goal of this session was to get users to the point where they could take data from a web API and ask and answer basic data science questions, by using Python to manipulate data and by creating simple visualizations. 17 students responded to our follow-up survey, although it was not isolated to Session 3 but covered the sessions as a whole.

Our philosophy in Session 3 was to teach users to get data into tools they already know and use. We thought this would be a better use of their time and help make users independent earlier. Based on feedback from attendees, we knew that almost every user who attended our sessions had at least basic experience with spreadsheets and with using spreadsheets to create simple charts. We tried to help users process data using Python into formats that they could load into existing tools like LibreOffice, Microsoft Excel, or Google Docs.
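
A sketch of that approach with made-up data: process something in Python, then write a CSV file that any of those tools can open and chart directly (the csv module is in the standard library).

```python
# Write made-up word counts to a CSV file for charting in a spreadsheet.
import csv

word_counts = [("wikipedia", 120), ("python", 85), ("kitten", 42)]

# On Python 2, the csv module expects files opened in binary mode.
output_file = open("word_counts.csv", "wb")
writer = csv.writer(output_file)
writer.writerow(["word", "count"])          # header row
for word, count in word_counts:
    writer.writerow([word, count])
output_file.close()
```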

We did not hold another review lecture because we had recorded the Session 2 review, and encouraged participants to watch it the night before.

Normally, only about 3-4 mentors would show up in the morning, but this time all 10 showed up, so there were far too many during the lecture. Again, the timing between sessions was problematic; students had forgotten a lot of the curriculum between workshops. We also ran out of stickies, which made it difficult for students to discreetly request help during the lecture.

This day was probably the most laid-back of all the sessions. By this point, the participants had formed some really strong friendships and they looked forward to spending time working together.

Morning lecture

Because it did not require installation of software and because it ran on every platform, we did sorting and visualization in Google Docs. Once we fetched and processed some data in Python, we imported it into Docs. This was a really great group exercise that we worked on together, and I noticed students really "getting it" as all our different tools started to come together.

One thing that was suggested for future lectures and exercises was to go over the entire thought process, from start to finish, in which we identified a problem, worked through a solution, and then coded it up. I thought this was a great idea, especially to introduce early (Session 2 morning? Session 1 afternoon?).

Afternoon session

One of our two afternoon session leaders didn't show up, so we just had one informal session in which I showed some examples, led an informal Q&A, followed up on the morning lecture, etc. Showing the matplotlib examples (while not requiring students to follow along, possibly avoiding frustration) was very popular, and it would have been cool to expand this to the whole class with some pre-generated data sets.
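
The matplotlib examples were along these lines (a minimal sketch with made-up data, not the exact code shown in the session):

```python
# Draw a simple bar chart from a small, made-up data set.
import matplotlib.pyplot as plt

words = ["wikipedia", "python", "kitten"]
counts = [120, 85, 42]

plt.bar(range(len(words)), counts)
plt.xticks(range(len(words)), words)   # label each bar with its word
plt.ylabel("count")
plt.title("Word counts (example data)")
plt.show()
```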

General Feedback

One important goal was to help get learners as close to independence as possible. We felt that most learners did not make it all the way. However, the last session was one of our most successful as we tied up loose ends, talked about next steps, and gave our students the opportunity to really see some of the power of what they were working on. At UWaterloo, there is a lot of siloing between the faculties, especially based around technical skills. We felt that we really succeeded in helping to break down that barrier, and in showing students from less quantitative backgrounds that these skills are not the exclusive domain of math and engineering.

  • The spacing between sessions was too large, and we knew from the CDSW that this might be a problem. Nonetheless, we were constrained by the amount of free time the volunteers had to contribute. I believe the CDSW have moved to workshops with only a week's gap between them. This would be great, but we currently don't have the resources for it.
  • A big difference between our PWFB and the CDSW is that our group of volunteers consisted entirely of students. As a result, our mentors were a bit less confident and we had less time to prepare as a group and organize curriculum between classes. On the other hand, I think this was hugely beneficial for the mentors in a way they may not have expected, comparable to the benefit for the attendees; many volunteers got to flex their teaching muscles for the first time, and many gave their first lectures.
  • The general structure of the entire curriculum was not as clear as it might have been, which led to some confusion. We had one untested afternoon session that introduced far too much complexity and may have intimidated many of the attendees. Though the workshops had been run before, Elana had only mentored at Session 0/1 and all of the sessions were new to the Waterloo volunteers. If we were to run the workshops again, it would definitely be easier, as a lot of the bumps have been ironed out.
  • We ran into a few bumps with Python on Windows (binary mode for writing placekitten files was a problem), but not nearly as many as the CDSW had. Using Anaconda may have helped, here. As well, we had a lot of mentors familiar with Windows, which CDSW did not have.
  • A majority of the participants in the session were women and identified as a visible minority. We had a diverse mix of volunteers in terms of gender, race, and ability, which really seemed to encourage our diverse participants.
  • The SWC-style sticky notes worked extremely well but were used less, and seemed to have less value, as we progressed. Running out didn't help!
  • Students really appreciated the collaborative environment that we fostered, and many remarked on the great experience they had making new friends and solving puzzles together.

At the suggestion of the CDSW, we spent some time teaching debugging and finding/reading documentation as part of our exercises. This seemed to work very well.

Budget

We budgeted $10 for supplies (sticky notes), but only spent $5. In the mornings we bought coffee for $35/$50/$35 (we didn't have enough for the first session, but then attendance dropped). For lunch we spent $250 (pizza and soft drinks, plus non-pizza options for those with dietary restrictions), $310 (sandwiches), and $500 (fancy Indian food) across the three full-day sessions. This was for 50 students and ~10 mentors, though we had leftovers at the last two sessions. We tried to avoid serving pizza the entire time while still keeping to a stricter budget, and attendees raved about the high quality of the free food.

We spent $260/$130/$240/$150 (varying based on the restaurant visited and number of attendees) on our mentor dinners, which were very well-attended and very popular. It was a great opportunity for our mentors to bond and go over the day's feedback and it was a bit of an adventure splitting food between the group of 10! Having a bit of extra budget was very worthwhile for these events, but you could likely do them on $150/dinner for about 10 people.

All of our food was generously sponsored by the Python Software Foundation. The rooms were provided for free by the School of Computer Science at the University of Waterloo.

Our total cost came to ~$2000 CAD.