Day 1 pre-lecture

From OpenHatch wiki

Why Programming and Data Science

The question of who controls our technology, our information, and our data is increasingly the question of who controls our experience of the world and each other. Programming is the power to define technology. It can be, in this sense, deeply empowering.

In a technological and data-driven world, being able to program and do data science is a kind of literacy. Imagine a world in which everybody could read but only some people could write.

Our goal here is not to turn you into the programming equivalent of novelists or journalists. Our goal is to demystify things and give you enough information to become dangerous.

Programming, you will also find (probably a little today and a lot more later on), is enormously fun. For me, it's like meditation and problem solving. It's exactly as frustrating as a difficult puzzle and even more rewarding, because your solution accomplishes something else you were trying to do.

Why Python

I know a dozen programming languages and write 4-5 regularly. But Python is the right one.

Python is a fantastic language to learn

Believe it or not, compared to other programming languages:

  • Python has a low "syntactic overhead".
  • It's easy to get work done quickly.
  • It's relatively forgiving.

Python is versatile and useful for a range of applications

There are easier programming languages to learn. But Python is important because it is not a toy. In designing the curriculum for these workshops, we have tried to only teach tools that we, as professional data scientists and programmers, use ourselves and find useful.

Python is used for:

  • Web applications (Instagram, Pinterest, and the Washington Post all run websites written largely in Python).
  • Python can be used to extend existing applications. You can use it to script many graphical applications.
  • Python is fantastic for dealing with and manipulating text.
  • Python can be used to build graphical games (Frets on Fire).
  • Python really shines when it comes to dealing with data and with the web.
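
As a small taste of the point about text, here is a minimal sketch of counting words in a sentence. The sample sentence and the variable names are made up for illustration; they are not part of the workshop projects.

```python
from collections import Counter

# Hypothetical sample text, chosen only for illustration.
text = "the quick brown fox jumps over the lazy dog and the cat"

# Split the text into a list of words, breaking on whitespace.
words = text.split()

# Count how many times each word appears.
counts = Counter(words)

# Ask for the three most common words, along with their counts.
top_three = counts.most_common(3)

print(len(words), "words in total")
print("'the' appears", counts["the"], "times")
print(top_three)
```

Splitting, counting, and ranking take one short line each, which is the kind of low overhead described above.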

Housekeeping Notes Before We Begin

  • Several people from eSciences are here to learn about how folks learn data science. They are also here to learn and teach. If you're bothered by their presence, let them know or let me know.
  • I am beyond proud to announce that we have at least two mentors this time who were enrolled as students last time!
  • We have a few mentors from last time who will be running sessions and projects this time!


  • Broad overview: three sessions
  • Lecture
  • Lunch over in CMU Building
  • Projects until 3:30: three projects, think about them

Final Notes

  • Choose sessions: Wordplay; Baby Names; Codecademy (and then Wordplay)
  • Food will be out in the atrium outside CMU 126. I'm told it's here.
  • We will get rooms divided up and put them on the whiteboard in CMU 104. Go there to see where things are.