Note: This is a proposal being submitted by Tracy Teal (@tracykteal) for PyCon '14. I suggested she post it here for feedback, because she does not have her own blog. --titus
TITLE: How I learned to stop worrying and love matplotlib
DURATION: I prefer a 30 minute time slot
DESCRIPTION: I was living a dual life, programming in Python and
plotting in R, too worried to move my R code to Python. Then I decided
to make 100 plots in Python in 100 days. I documented the journey on a
website, posting each plot and its code in IPython Notebooks and
welcoming comments. This talk will summarize lessons learned,
including technical details, the process and the effects of learning
in an online forum.
AUDIENCE: Scientists interested in statistical computing with Python,
and those interested in learning more about NumPy and matplotlib.
PYTHON LEVEL: Beginner
OBJECTIVES: Attendees will see use cases for numpy and matplotlib, as
well as one approach to succeeding (or failing) at challenging
yourself to learn something new.
DETAILED ABSTRACT: Many scientific programmers use multiple languages
for different applications, primarily because specific packages are
available for their standard use cases or because they're working with
existing code. While these languages work well individually, splitting
work across them can limit the ability to integrate different
components of a project into one framework. The reasons not to use
numpy, matplotlib and pandas are therefore often not technical;
rather, the effort required to learn or develop a new approach, when
there are already so many demands on a scientist's time, can be
inhibiting. Additionally, the development of new packages or
integrated code bases is often not as valued in academia.
I am one of those scientists, a microbial ecologist and
bioinformatician, writing most of my code in Python and teaching it in
Software Carpentry, but doing all my statistics in R. I like R and
the R community, and in particular the ecological statistics package
vegan, so I haven’t felt the need to switch. But I realized my
reluctance was mainly because I didn't know how to do the same things
in Python, not because R was necessarily better for my workflow. So, I
figured I should at least give it a try, but it was always a task on
the back burner and not particularly interesting. Inspired by
Jennifer Dewalt's 180 web sites in 180 days, by the idea of making
something in order to learn particular skills, and by the process of
deliberate practice, I decided to start a project: 100 plots in 100
days. In this project I will make a plot every (week)day for 100 days
using Python. The plots will range from y=x to visualizations of
multivariate statistics and genomic data. I use matplotlib, numpy and
pandas, make
the plots in IPython Notebook and post the notebook and comments about
the process of creating that plot on my blog. I welcome comments to
get more immediate feedback on the process.
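As a hypothetical illustration of the simplest end of that range (this
is not code from the project itself), a y=x plot with numpy and
matplotlib might look like the sketch below. The function name
`plot_identity`, the axis range and the labels are assumptions made
for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def plot_identity(n_points=50):
    """Draw y = x on a fresh figure and return (fig, ax).

    Hypothetical helper for illustration; not part of the project.
    """
    x = np.linspace(0, 10, n_points)  # evenly spaced x values
    y = x                             # the identity relationship
    fig, ax = plt.subplots()
    ax.plot(x, y, label="y = x")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_title("y = x")
    ax.legend()
    return fig, ax


fig, ax = plot_identity()
```

In a notebook the figure would typically render inline; in a script,
calling `fig.savefig("plot.png")` (file name chosen arbitrarily here)
would write it to disk.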
This talk will focus on lessons learned during the project, both
technical and about the process of learning: the expected and
unexpected outcomes, and how community involvement affects that
process.
OUTLINE:
- Intro (5 min)
  - Who am I?
  - Why this project?
  - Show the website
- Lessons learned (18 min)
  - Technical lessons learned
    - numpy/matplotlib tricks or tips
    - any new statistical algorithms developed for numpy
  - Lessons learned about learning
    - Was this process good for learning something new? Why/why not?
    - Deliberate practice has been shown to be the most effective way to get good at something. It involves working at something and getting feedback. Was this approach good for that?
  - Social aspects
    - Response to the project
    - Social pressures and accountability - does saying you'll do something publicly make you more likely to do it?
- Concluding remarks (2 min)
  - Would I do something like this again? Would I recommend it?
- Questions (5 min)
ADDITIONAL NOTES:
- I'm just starting this project, inspired by both a recent Hacker
News post on Jennifer Dewalt's 180 web sites in 180 days and the
opportunity to present at PyCon. As such, at review time, I'll only
be beginning the journey. Success for me for this project would be
following through on the 100 plots in 100 (week)days, learning the
fundamentals of numpy and matplotlib and making some neat and useful
plots along the way. I'll share all the code for the statistics and
each plot on the website, however ugly it may be. This could fail
too. I might not get beyond variations on a y=x plot, and I might
write terrible code. This talk will document both the successes and
the failures, as I hope I and others can learn from both. I do
understand the risk of accepting a talk like this where I can't yet
tell you what the lessons learned will be.
- This would be my first time speaking at PyCon. I've spoken at many
scientific conferences, been selected as an Everhart Lecturer at
Caltech and received "Best Presentation" awards at conferences. I've
also been an instructor for five Software Carpentry bootcamps,
including one for Women in Science and Engineering.