Note: This is a proposal being submitted by Tracy Teal (@tracykteal) for PyCon '14. I suggested she post it here for feedback, because she does not have her own blog. --titus
TITLE: How I learned to stop worrying and love matplotlib
DURATION: I prefer a 30 minute time slot
I was living a dual life, programming in Python and plotting in R, too worried to move R code to Python. Then I decided to make 100 plots in Python in 100 days. I documented the journey on a website, posting the plot & code in IPython Notebooks and welcoming comments. This talk will summarize lessons learned, including technical details, the process and the effects of learning in an online forum.
Scientists interested in statistical computing with Python, those interested in learning more about NumPy and matplotlib.
PYTHON LEVEL: Beginner
Attendees will see use cases for numpy and matplotlib, as well as one approach on how to succeed (or fail) at challenging yourself to learn something new.
Many scientific programmers use multiple languages for different applications, primarily because specific packages are available for their standard use cases or they're working with existing code. While these languages work well, it can limit the ability to integrate different components of a project in to one framework. The reasons not to use numpy, matplotlib and pandas is therefore often not technical, but the effort required to learn or develop a new approach when there are already so many demands on a scientist's time can be inhibiting. Additionally the development of new packages or integrated code bases are often not as valued in the academic structure.
I am one of those scientists, a microbial ecologist and bioinformatician, writing most of my code in Python and teaching it in Software Carpentry, but doing all my statistics in R. I like R and the R community and in particular, the ecological statistics package, vegan, so I haven’t felt the need to switch, but I realized my reluctance was mainly because I didn't know how to do the same things in Python, not that R was necessarily better for my workflow. So, I figured I should at least give it a try, but it was always a task on the back burner and not particularly interesting. Inspired by Jennifer Dewalt's 180 web sites in 180 days, the idea of making something in order to learn particular skills and the process of deliberate practice, I decided to start a project 100 plots in 100 days. In this project I will make a plot every (week)day for 100 days using Python. Plots encompass y=x to visualizations of multivariate statistics and genomic data. I use matplotlib, numpy and pandas, make the plots in IPython Notebook and post the notebook and comments about the process of creating that plot on my blog. I welcome comments to get more immediate feedback on the process.
This talk will focus on lessons learned during the project, both technical and about the process of learning - the expected and unexpected outcomes and how the involvement of community impacts practice.
- Intro (5 min)
- Who am I?
- Why this project?
- Show the website
- Lessons learned (18 min)
- Technical lessons learned
- numpy/matplotlib tricks or tips
- any new statistical algorithms developed for numpy
- Lessons learned about learning
- Was this process good for learning something new? Why/ why not?
- Deliberate practice has been shown to be the most effective way to get good at something. It involves working at something and getting feedback. Was this approach good for that?
- Social aspects
- Response to the project
- Social pressures and accountability - does saying you'll do something publicly make you more likely to do it
- Concluding remarks (2 min)
- Would I do something like this again? Would I recommend it?
- Questions (5 min)
- I'm just starting this project, inspired by both a recent Hacker News post on Jennifer Dewalt's 180 web sites in 180 days and the opportunity to present at PyCon. As such, at review time, I'll only be beginning the journey. Success for me for this project would be following through on the 100 plots in 100 (week)days, learning the fundamentals of numpy and matplotlib and making some neat and useful plots along the way. I'll share all the code for the statistics and each plot on the website, however ugly it may be. This could fail too. I could not be able to get beyond variations on a y=x plot and write terrible code. This talk will document both the success and the failures, as I hope I and others can learn from both. I do understand the risk of accepting a talk like this where I can't yet tell you what the lessons learned will be.
- This would be my first time speaking at PyCon. I've spoken at many scientific conferences, been selected as an Everhart Lecturer at Caltech and received "Best Presentation" award at conferences. I've also been an instructor for five Software Carpentry bootcamps, including one for Women in Science and Engineering.