Community Data Science Workshops: Difference between revisions

→‎What is the CDSW: fix formatting bugs
imported>Mako
(draft of description)
== What is the CDSW ==
 
The '''Community Data Science Workshops''' are a series of project-based workshops for anyone interested in learning how to use programming and data science tools to ask and answer questions about online communities like [https://en.wikipedia.org Wikipedia], [http://twitter.org Twitter], [[:wikipedia:free software|free and open source software]], and [[:wikipedia:civic media|civic media]].
 
The workshops are for people with ''no previous programming experience.'' The goal is to bring together researchers and academics with participants and leaders in online communities. The workshops have all been free of charge and are open to the public.

'''Introduction to Programming (Session 1)''' — Programming is an essential tool for data science and is useful for solving many other problems. The goal of this session will be to introduce programming in the Python programming language. Each participant will leave having solved a real problem and built their first real program in their group. We will rely on the curriculum from the Boston Python Workshops. Because we expect to hit the ground running, we will also run a session on the evening of '''the preceding Friday (Session 0)''' to help participants get software installed.
 
'''Importing Data from Web APIs (Session 2)''' — An important step in doing data science is collecting data. The goal of this session will be to teach participants how to get data from the public [[:wikipedia:Web API|application programming interfaces]] (“APIs”) common to many social media platforms and online communities. Although we will use the APIs provided by Wikipedia and Twitter in the session, the principles and techniques are common to many online communities.
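
As a rough illustration of the kind of task Session 2 covers, the sketch below builds a request URL for the Wikipedia (MediaWiki) web API and pulls editor names out of a JSON reply. It is not taken from the workshop curriculum: the function names and the sample reply are hypothetical, and the sample data is invented so the example runs without a network connection.

```python
import json
from urllib.parse import urlencode

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_revisions_url(title, limit=5):
    """Return a URL asking for the most recent revisions of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user",
        "rvlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_editors(response_text):
    """Return the user names found in a revisions reply, in order."""
    data = json.loads(response_text)
    editors = []
    for page in data["query"]["pages"].values():
        for revision in page.get("revisions", []):
            # Some revisions may not expose a user name; skip those.
            if "user" in revision:
                editors.append(revision["user"])
    return editors


# Invented reply in the shape the API returns with format=json.
SAMPLE_RESPONSE = json.dumps({
    "query": {
        "pages": {
            "1": {
                "title": "Free software",
                "revisions": [
                    {"user": "ExampleUserA", "timestamp": "2014-01-01T00:00:00Z"},
                    {"user": "ExampleUserB", "timestamp": "2013-12-31T00:00:00Z"},
                ],
            }
        }
    }
})

url = build_revisions_url("Free software")
editors = extract_editors(SAMPLE_RESPONSE)
```

To try this against live data, the URL could be opened with a web browser or with Python's `urllib.request.urlopen`, and the text of the reply passed to `extract_editors`.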
 
'''Data Analysis and Visualization (Session 3)''' — The goal of data science is to use data to answer questions. In our final session, we will use the Python skills we learned in the first session and the datasets we’ve created in the second to ask and answer common questions about the activity and health of online communities. We will focus on learning how to generate visualizations, create summary statistics, and test hypotheses.
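
To give a flavor of the summary statistics mentioned for Session 3, here is a minimal sketch that counts edits per editor using only the Python standard library. The editor names are made up, and `summarize_edits` is a hypothetical helper rather than part of the workshop materials.

```python
from collections import Counter
from statistics import mean, median


def summarize_edits(editors):
    """Summarize a non-empty list of editor names, one entry per edit."""
    counts = Counter(editors)
    edits_per_editor = list(counts.values())
    return {
        "total_edits": len(editors),
        "unique_editors": len(counts),
        "mean_edits_per_editor": mean(edits_per_editor),
        "median_edits_per_editor": median(edits_per_editor),
        "most_active": counts.most_common(1)[0][0],
    }


# Invented data: each entry is the author of one edit.
sample_edits = ["alice", "bob", "alice", "carol", "alice", "bob"]
summary = summarize_edits(sample_edits)
```

A list like `sample_edits` could come from any source that records who made each contribution, and the same counts could later be fed into a plotting library to produce a visualization.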