Wed, 28 Feb 2007

Intermediate and Advanced Software Carpentry with Python


I have been asked to submit an outline for a three-day course on Python, for ~20 scientists who already know basic Python. On fairly short notice, I came up with the following; what am I missing? (I plan to make the course materials publicly available, of course.)

(Note that I was explicitly asked about teaching IDLE.)

Outline

Three days: plan for three hours of instruction, three hours hands-on, plus breaks.

Day 1

Goal: Ensure that participants understand how to build re-usable Python
code & design for re-use and maintenance.

Straight Python:

  • building Python programs and laying out packages
  • writing for reusability
  • maintaining Python codebases & testing
  • advanced features of the Python language
  • a brief intro to extending Python with C/C++

This day will be devoted to exploring people's knowledge about Python, and can be adjusted dynamically to provide more basic or more advanced information.

Day 2

Goal: Introduce participants to the variety of (excellent!) tools for
working with Python, especially in science.

Tools

  • Wrapping C/C++ code automatically
  • NumPy/SciPy
  • Rpy, matplotlib: tools for plotting
  • UNIX tools to help you develop and collaborate: screen, VNC
  • IDLE/IDEs
  • Centralized and distributed version control
  • Trac project management
  • IPython interactive Python interpreter

This day will explore the variety of tools for effectively working with and reaching out from Python.

Day 3

Goal: Provide hands-on experience with automatically producing static
and interactive views of your data and analysis results.

Databases, data analysis, and data presentation

  • Storing data in a structured manner

  • Built-in Python options (shelve/bsddb)

  • Using SQL
    • SQLite
    • MySQL/PostgreSQL
  • Building static HTML output

  • Building dynamic HTML output with CGI/CherryPy

  • Tying the database into your Web server

  • Testing your Web stuff

This day will introduce people to effective techniques for data storage and presentation with Python.

(A whole day might be needed because of the variety of topics: both HTML and SQL must be introduced!)
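As a rough idea of how small the "store it in SQL, then build static HTML" step can be, here is a minimal sketch. The table name, the column names, and the `build_report` helper are all made up for illustration, and it uses an in-memory SQLite database rather than anything the course would actually ship.

```python
import sqlite3
from html import escape  # Python 3; in 2007 this would have been cgi.escape


def build_report(rows):
    """Store (sample, value) rows in SQLite and return a static HTML table.

    `rows`, the `measurements` table, and its columns are hypothetical.
    """
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing on disk
    conn.execute("CREATE TABLE measurements (sample TEXT, value REAL)")
    conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
    conn.commit()

    cursor = conn.execute(
        "SELECT sample, value FROM measurements ORDER BY sample")
    lines = ["<table>", "<tr><th>sample</th><th>value</th></tr>"]
    for sample, value in cursor:
        # Escape text pulled from the database before it goes into HTML
        lines.append("<tr><td>%s</td><td>%.2f</td></tr>"
                     % (escape(sample), value))
    lines.append("</table>")
    conn.close()
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_report([("b", 2.5), ("a", 1.0)]))
```

Swapping `:memory:` for a filename gives a persistent database, and the same query could just as well feed a CGI script or a CherryPy handler for the dynamic case.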

The menu of topics

Building reusable code:

  • modules, globals vs locals, import issues
  • PYTHONPATH
  • building/installing packages: distutils, easy_install, 'require'

Testing

  • doctests, unittests, test fixtures
  • more advanced unit testing tools: nose/py.test
  • code coverage/figleaf

Simple database stuff

  • pickling
  • bsddb/shelve
  • SQL, sqlite, and MySQL/PostgreSQL
  • Durus/ZODB: object databases

Docstrings and automatic generation of documentation

Building Python interfaces to C and C++ code

  • writing simple interfaces manually is easy
  • SWIG, Boost.Python, SIP: examples & tradeoffs
  • C++ special stuff
  • testing C code from Python

Java/Jython

.NET/Mono/IronPython

NumPy/SciPy

matplotlib, a MATLAB-style Python graphing/display system

Rpy, Python interface to R

Generators, iterators, yield, list/generator comprehensions

The lesser known (but useful!) corners of the Python stdlib

File management and APIs: how to deal nicely with paths, data files,
etc.

Using subprocess to flexibly execute external programs.

IPython interactive Python prompt

Another way to develop: scripting with two windows

XML parsing

Generating HTML for analysis summary and presentation

The logging package: logging and py.logging

Python interfaces to MPI

Concurrency and threading in Python: threading vs fork vs...; the
Global Interpreter Lock

py.lib sshexec, a flexible way to run programs on multiple computers

How Python is developed and how to think about backwards/forwards compatibility

IDEs: IDLE

Building simple Web servers (with CherryPy, probably? Or CGI.)

A brief introduction to GUI development in Python.

UNIX tricks: screen, VNC

pdb, the Python debugger

Building your own types: using dicts and lists as interfaces to your own
data; advanced dictionary use.

Version control with subversion, darcs, bzr-ng

Project, ticket, and timeline management with Trac.

posted at: 12:08 | path: /feb-07 | 9 comments



"Stupidity Driven Testing" and PyCon '07


Of all the fever-induced hallucinatory things I said at PyCon '07, I'm proudest of this: "I don't do test-driven development; I do stupidity-driven testing. When I do something stupid, I write a test to make sure I don't do it again."

So true.

For readers who don't get it, my development practice is this:

  1. write code to solve some problem
  2. watch code break in some obvious way
  3. write a test that tests that specific breakage
  4. lather, rinse, repeat.

I don't mind making mistakes, even stupid ones: I just don't want to repeat them. Thus, this development technique.
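A minimal sketch of what that loop can leave behind, assuming a hypothetical `mean()` helper whose first version crashed with a `ZeroDivisionError` on an empty list and silently truncated integer input. The function, the bugs, and the test names are invented for illustration; the tests are plain functions, so nose or py.test should pick them up unchanged.

```python
def mean(values):
    """Return the arithmetic mean of a sequence of numbers."""
    # Steps 1-2: the (hypothetical) first version divided by len(values)
    # unconditionally and broke on an empty list. This guard is the fix.
    if not values:
        raise ValueError("mean() requires at least one value")
    # float() keeps integer input from being truncated on older Pythons
    return sum(values) / float(len(values))


def test_empty_input_raises_clear_error():
    # Step 3: pin down the specific breakage so it cannot quietly return
    try:
        mean([])
    except ValueError:
        return
    raise AssertionError("mean([]) should raise ValueError")


def test_integer_input_is_not_truncated():
    # A second stupid mistake, a second test
    assert mean([1, 2]) == 1.5


if __name__ == "__main__":
    test_empty_input_raises_clear_error()
    test_integer_input_is_not_truncated()
    print("all regression tests passed")
```

Each test here records one mistake that actually happened, so the suite grows only as fast as the bugs do.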

General comments on PyCon:

I did something every day: tutorial, panel moderator, panel-ee, and speaker. Way too much work, stress, and time spent in preparation.

Meeting people was great. I got to meet & hang out with the ARINC people, Shannon ("jj") Behrens, James Taylor of Galaxy, Terry Peppers, and a bunch of other people that I knew only from e-mail. I also saw a bunch of faces from last year, of course, including Brian Dorsey (thanks for the lift, Brian & Kirk!)

The talks were (from my limited vantage point) much better this year than last year. This is presumably a reflection of the increasing size of the Python community.

The "nose vs py.test" debate is growing in size, if not reaching any actual conclusion. It's very clear to me, at least, that these are the big new testing tools; I'm (obviously) pushing for nose, but I'd really like to see a showdown of features so that I can convert this from flame-boy advocacy into informed advocacy.

I regret not attending more Birds-of-a-Feather sessions.

The keynotes were fantastic, in general. I didn't enjoy the education one, per se, but a lot of interesting stuff was said.

I hope the recordings are up soon.

PyCon: Day 0 (tutorial)

The tutorial day was, as usual, fun! Grig and I gave our testing tutorial and even though we felt less prepared than last year, I think our significantly increased experience with actually using these tools (see the ARINC talk in particular) showed.

Reviews (both positive) by Shannon -jj Behrens and Terry Peppers.

Next year, we should have a book or two out on these topics, which will be an entertaining addition.

PyCon: Day 1 (Web panel)

I spent most of this day sweating about the PyCon Web Panel, which in any event turned out fine. Once I finally worked out the format for the panel in my own head (2 minutes of introductions by me, followed by questions spread evenly among the participants) I was much more relaxed about things. (Perhaps the most fun I had with this aspect was reciprocating Grig's prodding: he constantly told me that I was over-preparing, and then when it was his turn for the Testing Tools panel I got to prod him for over-preparing. Back atcha... ;)

The panel was really meant to showcase personalities and get faces out there; 45 minutes is way too short for any meaningful discussion.

Being in front of that many people made me really freakin' nervous.

One obvious (to me) conclusion from the panel was that TurboGears and Pylons should merge. This may happen eventually, but not right now ;).

Another obvious conclusion (and I actually said something to this effect) was that documentation is a huge problem. Huge. The framework that documents will dominate IMO. (Right now I'm guessing that this will be Django, but only because Adrian consistently acknowledged the need for documentation.)

It was interesting to discover that Twisted had AJAX-like behavior a year or two before AJAX hit. I think Zope and Twisted both need to hire a PR expert to publicize their coolness; I get the impression that the communities are relatively insular and this contributes to a lack of buzz about their accomplishments.

My favorite comment, by Jonathan Ellis: "Django's ORM is feeble."

James Bennett has a disturbingly complete transcript.

Other reviews/notes: Jonathan Ellis, James Tauber (international man of mystery!), Nathan Yergler (I agree, Nathan! But I asked for more time!), Shannon ("jj") Behrens, and Matt Harrison.

Hopefully a video of this event will be posted. I want to listen to what I actually said. ;)

PyCon: Day 2 (Testing Tools panel)

I wasn't as worried about the Testing Tools panel, 'cause I didn't have to say anything. Of course, it turns out I said too much as a result ;).

The panel was fun but kind of a blur. A bit more time next year, perhaps?

Reviews/notes: Grig Gheorghiu, and Matt Harrison.

I attended the buildbot BoF, which was really fun. Brian Warner rocks.

PyCon: Day 3 (twill talk)

I spent most of the prior evening and morning working on my twill/scotch/figleaf talk. During this time I learned just enough about CherryPy 3.x and Django whatever-the-heck-the-latest-version-is-with-or-without-magic-removal-who-the-hell-knows to actually write test fixtures for them.

I decided to go out on a limb and rather than describe twill/etc. in nauseating detail I worked up nine demos (testing CherryPy sites, doing coverage analysis, writing twill extensions, testing Django sites, and recording Web traffic) and I ran through the demos interactively while providing a narrative.

I really enjoyed this talk format, although it may not be for everyone.

You can grab my talk source code here, although this link will eventually (soon!) be broken & moved to an archive containing more documentation.

Review: The Thiers.

I announced the testing-in-python (TIP) mailing list in my talk.

Grig's pybots talk was well-received, and IMO this project is going to dramatically increase the solidity of the Python community's software.

I also got a chance to run some of my ideas for improving test processes on the Python interpreter past Brett Cannon, and (to my shock) he was really open to them. More on that soon.

That evening, I got a chance to meet up with R. Steven Rainwater ("robogato") and his wife Susan; Steven has taken over advogato. They took me out to a nice sushi place, which was really welcome after the heavier food I'd been eating thus far. More on that anon.

Post-PyCon: travelling to San Antonio

After PyCon, Diane Trout and I shuffled ourselves over to UTSA to talk with the nice people at the Computational Biology Initiative. The CBI is interested in making a commitment to future development of Cartwheel, which is pretty cool. More anon.

--titus

posted at: 10:59 | path: /feb-07 | 0 comments



Sun, 25 Feb 2007

Any advogatoans in Dallas want to get together?


Robogato and I are probably going to have dinner tonight here in Addison at ~6pm. Drop me a line if you're interested.

--titus

posted at: 10:15 | path: /feb-07 | 0 comments


Fri, 23 Feb 2007

PyCon '07: The hard part is over!


I have survived both our Testing Tutorial and the PyCon Web Panel! These were the things I was worried about... Now all I have to do is sit around and answer random questions (for the Testing Tools panel on Saturday) and give a talk on my various testing projects on Sunday.

Huzzah!

--titus

posted at: 13:58 | path: /feb-07 | 1 comments



Wed, 21 Feb 2007

PyCon travel is cursed!


Last year, Grig and I flew down to Dallas together to attend PyCon. I don't recall the exact problem, but for some reason the plane didn't go and we ended up flying through Houston and missing our original flight to Dallas. (We did have some unreasonably yummy BBQ in Houston, so that was an unexpected plus.)

The curse has continued.

Right now, I'm sitting in an airport, using Boingo to connect to the Internet (an excellent $30/mo expenditure, thus far). My first plane was cancelled due to mechanical malf (an America West gig); I didn't make it off of standby for my second plane; and now, barring oddities, I'm definitely booked on to a 3:30 flight.

Better luck to the rest of you...

Is anyone reading this who wants to share a taxi ride at ~8:30pm over to the Dallas Marriott Quorum from DFW? E-mail me by 2:30pm Pacific time... ;)

--titus

posted at: 12:06 | path: /feb-07 | 1 comments
