PyCon-2015-sprint-wiki
Below is a list of projects sprinting at PyCon 2015, along with the tools and skillsets they suggest new sprinters have. *Required* tools and skillsets are marked as such; generally speaking, everything else is nice to have and will make sprinting easier. This list is not exhaustive - there are plenty of sprinting projects not on it.
You can find more information about the sprints here.
Projects
OpenHatch
At this workshop? Shauna Gordon-McKeon, Carol Willing, Asheesh Laroia
List of useful skills: We have several projects you can work on, including the curriculum-focused Open Source Comes to Campus, our Python IRC bot, and our main website, written in Django. Familiarity with Python, Django, and open source tools/culture is useful, but we'll teach you anything.
People of all experience levels welcome: Yes
Details about sprint tasks (if supplied): There will be coding, documenting, UX, design, educational and community building tasks available! A nowhere near complete list of some of the code-focused tasks we'll be working on is here: https://github.com/openhatch/oh-mainline/milestones/Sprints
Django + Django Girls
At this workshop? Baptiste Mispelon
List of useful skills: No required technical knowledge or skills. Reading about the contribution process would be useful. Ability to work independently is great. Cursory knowledge of git/pull requests would come in handy.
People of all experience levels welcome: yes
SageMath
At this workshop? No
List of useful skills: Some mathematical knowledge and interest is highly useful.
Details about sprint: "This year at PyCon, we plan to have Sage Days during the sprints, so we will have some "sprint sessions" with other PyCon attendees as well as Sage presentation talks nearby."
More information at [1].
Morning sessions at UQAM, room PK-3605, Pavillon Président-Kennedy, 201, Président-Kennedy
StackStorm
At this workshop? No
List of useful skills: General familiarity with Linux, monitoring, and other "DevOps areas" is required.
Mailman
At this workshop? No
List of useful skills: Please have a general understanding of version control, issue trackers, and writing and testing Python. Please have SSH and GPG set up.
Open edX
At this workshop? Ned Batchelder
List of useful skills: Developing and debugging Python code; writing Python unit tests; basic web development paradigms (knowing the difference between client and server, knowing about HTTP requests, etc.); how to use GitHub to make pull requests. For people who want to work on user-facing parts of the project, you also need to know: basic HTML and CSS; JavaScript and require.js; Jasmine for testing JavaScript; and, ideally, Backbone.js for writing MVC-style JavaScript. Happy to help people learn about testing and JavaScript.
Pandas(?)
At this workshop? No
List of useful skills: Basic knowledge of the library. Please try to install the project ahead of time.
Falcon
At this workshop? No
List of useful skills: Sprinters of all skill levels are welcome. Contributors just need a basic understanding of Python and HTTP. Experience with REST APIs and web apps is a plus, but not required. Likewise, experience tuning Python code for performance is helpful, but is by no means required. In fact, if you'd like to learn more about any of these topics, Falcon is a great place to start.
We'll be working on completing our 0.3 milestone.
Khmer
At this workshop? No
List of useful skills: Needed: either some Python or C++ experience on the coding side. Help from technical writers is highly appreciated! No biology background needed.
Redislite
At this workshop? Dwight Hubbard
No coding experience required
List of useful skills: Writing Documentation, User Testing, Unit Testing, Redis, Celery, RedisPy, C, Django, Git, VirtualEnv, Tox, Sphinx
Details about sprint tasks:
- Close existing issues on the issue tracker.
- Try to complete enhancement requests for the Pycon2015 milestone on the issue tracker.
- Work through and review the documentation and fix issues.
Tryton
At this workshop? No
List of useful skills: Python, XML format, Mercurial, Database (PostgreSQL or SQLite)
PyLadies
At this workshop? Carol Willing
List of useful skills: No experience necessary. To work on the project, you will need Python and virtualenv installed.
People of all experience levels welcome: Yes
Details about sprint tasks (if supplied): We'll be working with: APIs; data analytics & visualization (Python, IPython notebooks, Pandas, d3.js, etc.); Python-based websites (e.g. Flask, Django); front-end work (CSS, JS, HTML); and documentation (rST markup, Sphinx, Read the Docs).
Hey Duwamish!
At this workshop? yes
Days Sprinting: Tuesday, Wednesday morning
List of useful skills: Git, Javascript, documentation, design, Django. And of course, Python! I am happy to provide training on just about anything else for the project.
PyKinect2
At this workshop?
List of useful skills: Hope people know Python and are excited about using Kinect in their projects/experiments.
QuantifiedCode
At this workshop? No
List of useful skills: "They should only be familiar with Python and be passionate about code quality!"
People of all experience levels welcome: Yes
Details about sprint tasks (if supplied): At QuantifiedCode we try to build the best Python code checker in the galaxy, so we want to discuss with people how we can help them write better code and be more productive. We propose a "Python anti-pattern" sprint, where people can talk about best practices and anti-patterns that they encounter in their work, and we will try to automate the detection of these patterns in their projects. This way, open-source projects will have a way to provide new contributors with coding guidelines tailored to their needs and conventions, making it much easier for people to contribute good, readable, and correct code.
CPython
At this workshop? No
List of useful skills: Some C or Python or both.
IDLE Reimagined & Tkinter Tutorial App
At this workshop? No
List of useful skills: Python beginners and educators are welcome to provide input on how to make IDLE a better educational tool for beginners.
People of all experience levels welcome: Yes
Details about sprint tasks (if supplied): In prep for these IDLE changes, I'd like to make a demo app on tkinter widgets (similar to wxPython's Demo app) and how to work with the tkinter module. Those with tkinter experience are very welcome.
Mochi
At this workshop? No
List of useful skills: Language design, testing, documentation, functional programming
JyNI
At this workshop? No
List of useful skills: "Participating is only feasible for people that can already at least roughly read C-code. Experience in C programming is recommended, but there are also refactoring and code-arrangement and cleanup-tasks that can be done by moving existing code around; that's why reading C-code is already enough for some tasks. In addition to improving C-skills I provide guidance for and insight into Python's C-extension API and also into Jython, how to use it, how it works."
streamparse
At this workshop? Andrew Montalenti
List of useful skills: Python; interest in Apache Storm / Apache Kafka; stream processing, data analytics.
Details about sprint tasks (if supplied): We'll be writing a Python Topology DSL for Apache Storm. This is a generic way to specify a directed acyclic graph of computation for data pipelines, which can then run Python code remotely on a cluster of machines -- thus sidestepping the GIL and allowing true parallelism. The plan is to use some fun Python features in order to write a good-looking DSL, e.g. I suspect metaclasses and descriptors will be involved.
Resources: streamparse on GitHub; PyCon2015 video presentation on streamparse; HTML notes on streamparse slides; core GitHub issue we'll hack on
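As a rough illustration of the kind of DSL the streamparse sprint describes, here is a minimal sketch of declaring a directed acyclic graph with a metaclass and a descriptor-style hook. It is not streamparse's API: the names `Component`, `TopologyMeta`, and `WordCount` are hypothetical, and the real design may differ entirely.

```python
class Component:
    """A node in the graph; `inputs` are the upstream Component objects."""

    def __init__(self, inputs=()):
        self.inputs = tuple(inputs)
        self.name = None

    def __set_name__(self, owner, name):
        # Called by Python (3.6+) when the class body is evaluated,
        # so each component learns the attribute name it was bound to.
        self.name = name


def topological_order(components):
    """Return component names with every node after its inputs."""
    order, state = [], {}

    def visit(name):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError("cycle detected at %r" % name)
        state[name] = "visiting"
        for dep in components[name].inputs:
            if dep.name not in components:
                raise ValueError("%r depends on an undeclared component" % name)
            visit(dep.name)
        state[name] = "done"
        order.append(name)

    for name in components:
        visit(name)
    return order


class TopologyMeta(type):
    """Collects Component attributes and validates the graph at class creation."""

    def __new__(mcls, clsname, bases, namespace):
        cls = super().__new__(mcls, clsname, bases, namespace)
        cls.components = {
            key: value for key, value in namespace.items()
            if isinstance(value, Component)
        }
        cls.order = topological_order(cls.components)
        return cls


class WordCount(metaclass=TopologyMeta):
    # Hypothetical pipeline: sentences -> words -> counts
    sentences = Component()
    words = Component(inputs=[sentences])
    counts = Component(inputs=[words])


if __name__ == "__main__":
    print(WordCount.order)
```

The metaclass runs once per class definition, so a malformed graph fails at import time rather than when the pipeline is deployed.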
Center for Open Science
At this workshop? No
List of useful skills: Python; JavaScript; web development.
People of all experience levels welcome: Yes
Details about sprint tasks (if supplied): The Open Science Framework (OSF) supports the entire research lifecycle: planning, execution, reporting, archiving, and discovery. It is developed by the Center for Open Science (COS.io). Many members of the COS team are here, and they can support various projects related to the open science initiative. We covered some of the things we might sprint on in the Lightning Talks. The video is at https://www.youtube.com/watch?v=yws4n-0-Yj8 and you should watch the two talks at 9:45 then the one at 47:30.
Resources:
- Lightning talk materials - https://osf.io/3winr/
- Waterbutler GitHub Repo - https://github.com/CenterForOpenScience/waterbutler
- Waterbutler Docs - http://waterbutler.readthedocs.org/en/latest/
- Waterbutler How-To Notebook - http://nbviewer.ipython.org/gist/chrisseto/4e8ef20dc6465cdfcdb1
- Modular ODM Repo - https://github.com/CenterForOpenScience/modular-odm
SHARE:
- SHARE Notification Service Repo - https://github.com/CenterForOpenScience/share
- Creating a metadata harvester for SHARE - https://osf.io/wur56/wiki/Creating%20a%20Harvester/
- Elasticsearch API - http://osf.io/api/v1/share/search/?raw
- Command line tool for visualizing SHARE data - http://github.com/erinspace/scrapi_stats
- Tool for converting OOXML files to HTML - https://github.com/CenterForOpenScience/pydocx
- Gist for adding text storage instead of using Cassandra - https://gist.github.com/fabianvf/597f57ffe8351156bb98
- Any interesting descriptive statistics or visualizations would be great to consider, in particular a list of top keywords (showing variance), % DOIs by provider, and % field by provider. For any graphs you make, also send Erin the underlying data: whatever table would be needed to recreate the graph (that is, the y value for each bar on the x axis).
- If you would like to work on a visualization, please just leave your name and the visualization you are working on here, so that we can avoid duplicating effort. To start us off, here are a few unclaimed visualizations/statistics:
* Analyzing what keywords appear the most across services
* Analyzing identifiers that appear across services (DOIs, URLs, etc.)
* Analyzing contributors that appear across services
* Top keywords/contributors/titles/identifiers/etc.
* Analyzing the number of providers that include certain fields
* Histograms would be a nice addition to the command line tool
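The keyword statistics listed above could start from something like the following sketch. The record shape (a `source` string and a `keywords` list) is an assumption made for illustration, not the actual SHARE document schema, and `SAMPLE_RECORDS` is made-up data.

```python
from collections import Counter, defaultdict

# Hypothetical records; real SHARE documents may use different field names.
SAMPLE_RECORDS = [
    {"source": "provider_a", "keywords": ["Genomics", "python"]},
    {"source": "provider_b", "keywords": ["genomics", "ecology"]},
    {"source": "provider_b", "keywords": ["Python", "genomics"]},
]


def top_keywords(records, n=10):
    """Count keywords case-insensitively and return the n most common."""
    counts = Counter()
    for record in records:
        counts.update(k.strip().lower() for k in record.get("keywords", []))
    return counts.most_common(n)


def keyword_provider_counts(records):
    """Map each keyword to the number of distinct providers that use it."""
    providers = defaultdict(set)
    for record in records:
        for keyword in record.get("keywords", []):
            providers[keyword.strip().lower()].add(record["source"])
    return {keyword: len(names) for keyword, names in providers.items()}


if __name__ == "__main__":
    print(top_keywords(SAMPLE_RECORDS, n=2))
    print(keyword_provider_counts(SAMPLE_RECORDS))
```

The lists returned by these functions are also the tables one would hand over alongside any bar chart built from them.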
New sources:
We can search for new sources via OpenDOAR: Directory of Open Access Repositories - http://opendoar.org/ - We need sources that are licensed CC0. OpenDOAR has an API that allows searching by subject, metadata licensing state, existence of an OAI url and others.
- API documentation - http://www.opendoar.org/tools/api.html NOTE: the PDF contains more information about search parameters; read that first.
- Small Python script that uses the above API to query for all sources that are in English, have science content, and allow free access to their metadata; the script parses the XML output and returns a JSON-formatted list of dicts containing the repository name, main URL, and OAI URL: https://gist.github.com/stitchinthyme/dfeac2c8579bbd2d2fb0
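The XML-to-JSON step that the script above performs can be sketched as follows. This is not the linked gist: the element names (`repository`, `name`, `url`, `oai_url`) and the sample document are assumptions for illustration and do not reflect the real OpenDOAR response schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical response body; the real API's element names will differ.
SAMPLE_XML = """
<repositories>
  <repository>
    <name>Example Archive</name>
    <url>http://example.org/</url>
    <oai_url>http://example.org/oai</oai_url>
  </repository>
  <repository>
    <name>No OAI Archive</name>
    <url>http://example.net/</url>
  </repository>
</repositories>
"""


def repositories_to_dicts(xml_text):
    """Parse the XML and return one dict per <repository> element."""
    root = ET.fromstring(xml_text.strip())
    results = []
    for repo in root.iter("repository"):
        results.append({
            "name": repo.findtext("name"),
            "url": repo.findtext("url"),
            # findtext returns None when the element is missing
            "oai_url": repo.findtext("oai_url"),
        })
    return results


def repositories_to_json(xml_text):
    """Serialize the parsed repositories as a JSON-formatted list of dicts."""
    return json.dumps(repositories_to_dicts(xml_text), indent=2)


if __name__ == "__main__":
    print(repositories_to_json(SAMPLE_XML))
```

Entries whose `oai_url` is `null` in the output can be filtered out before considering a repository as a harvesting source.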
Budapest Open Access Initiative - searching for BOAI or looking on this page for sources dedicated to open access to data: http://www.budapestopenaccessinitiative.org/list_signatures
- CalTech Library - http://caltechs.library.caltech.edu/cgi/oai2
- Harvard: Digital Access to Scholarship at Harvard - http://dash.harvard.edu/
- Oklahoma State Thesis and Dissertation Archive - http://www.library.okstate.edu/thesis/
- Oklahoma Library: General archive, some non-science content - http://www.library.okstate.edu/digital/
- Aberdeen University Research Archive - http://eprints.aston.ac.uk/cgi/oai2?verb=Identify
- Digital Commons Network - http://network.bepress.com
- Birkbeck Institutional Research Online - general archive, science and non-science content - http://eprints.bbk.ac.uk/cgi/oai2?verb=Identify
- Bournemouth University Research Online - http://eprints.bournemouth.ac.uk/cgi/oai2?verb=Identify (general - science at http://eprints.bournemouth.ac.uk/view/subjects/sci.html)
- Bradford Scholars - general archive, science and non-science content - http://bradscholars.brad.ac.uk/dspace-oai/request?verb=Identify
- Canterbury Research and Theses Environment - http://create.canterbury.ac.uk/cgi/oai2?verb=Identify (general - science at http://create.canterbury.ac.uk/view/subjects/Q.html)
- CEDA (Centre for Environmental Data Archival) - http://cedadocs.badc.rl.ac.uk/cgi/oai2?verb=Identify
- CentAUR (Central Archive at the University of Reading) - general archive, science and non-science content - http://centaur.reading.ac.uk/cgi/oai2?verb=Identify
- CADAIR (Aberystwyth University Repository) - http://cadair.aber.ac.uk/dspace-oai/request?verb=Identify
- William & Mary Virginia Institute of Marine Science - https://digitalarchive.wm.edu/handle/10288/615
- ARRO (Anglia Ruskin Research Online) - general archive, includes science - http://angliaruskin.openrepository.com/arro/oai/request?verb=Identify
- University of California - http://escholarship.org/
- Aston University Research Archive - http://eprints.aston.ac.uk/cgi/oai2?verb=Identify
- Cognitive Sciences ePrint Archive - http://cogprints.org/cgi/oai2?verb=Identify
- Central Lancashire Online Knowledge - http://clok.uclan.ac.uk/cgi/oai2?verb=Identify
- City University Research Online - http://openaccess.city.ac.uk/cgi/oai2?verb=Identify (general - science at http://openaccess.city.ac.uk/view/subjects/Q.html)
Influence-USA
At this workshop? Bob Lannon
List of useful skills: Python, web scraping, record linkage, computer vision
Details about sprint tasks (if supplied): We're working on scraping campaign finance and lobbying data that are publicly available on state government websites, to bring them all together in one central, public domain datacommons.
Resources: http://influence-usa.github.io
No Null Process
Inspired by Kate Heddleston's talk "How our engineering environments are killing diversity (and how we can fix it)," this repo contains basic starter docs/checklists for company processes. It is designed to be forked and changed to fit your company's needs. In addition to promoting a more diverse workforce, we believe that having processes like these will improve the experience for all new hires.
Info about the intro event
5:30-5:45 Pre-event
Attendees and mentors trickle in. Mentors update the sprint wiki (see above) as needed, and socialize with attendees.
5:45-6:00 Intro
Led by Shauna and Naomi. We'll talk about what the workshop will cover, and what the sprint schedule itself is. We'll be soliciting stories from mentors and others in the audience about past experiences sprinting, to help students picture what the next couple of days will be like.
6:00-7:00 Tools
Intro by Shauna, then small groups. We'll cover common open source tools, focusing on those mentioned by sprint leaders as useful for contributing to their projects. The first 15 minutes will be spent as a group doing overviews, then for the next 45 minutes we'll break into groups to cover specific topics in more depth.
I'm looking for mentors to cover any topic marked with a ?. Please feel free to add yourself to mentoring groups even if there's already a mentor or two specified. You can also suggest additional topics but I can't promise we'll cover them.
- IRC, Issue trackers, and beginning Git - Shauna, with help from Naomi Ceder
- Unit tests & testing generally - Ned, Liav
- Virtualenv - Eeshan Garg & Dustin J. Mitchell
- Advanced Git - Trey & Sam
- HTTP basics - Asheesh Laroia
- VirtualBox/Vagrant - Philip J.
Optional additional topics: SSH
7:00-7:20/30 Unsticking Yourself
Led by Shauna. Group brainstorm of ways you might get stuck and how to unstick yourself. Prompts:
- not being sure what to work on
- not being sure where to make changes in the code
- not being sure whether you've satisfied an issue
- your changes break tests
- you can't figure out what an error message means
7:30-end Project Intros
Project leads spend 2-5 minutes each describing their project, what they're planning to work on, and what kind of skills sprinters need. Afterwards, leads are encouraged to get dinner with potential sprinters.
Useful announcements
Sprint Lunch: 2-3PM tomorrow