Mon, 20 Aug 2007
SciPy 2007: Biology BoF
So, I "organized" a Biology Birds of a Feather at SciPy 2007. This mainly consisted of posting about it and then trying to write stuff on a white board while keeping abreast of the conversation. About 15 people attended.
I didn't get everyone's name and in any case I don't want to pin good/bad opinion labels on people ;). So this will be anonymous reporting!
Notes from the meeting were posted by two different people, and they are available from the biology-in-Python archives.
First things first: People who are interested in discussing their work in Python and biology should subscribe to the biology-in-Python mailing list. (After a number of negative comments from people about the first two discussions on the list, I will endeavor to be a bit of a moderator, so don't take past discussion as indicative of future ;)
Second, the biggest decision to come out of the BoF was to make an effort to build up a community presence with a bit of a Web site as well as things like tutorials, code links, discussion, etc. Brandon King has very kindly agreed to provide a basic Web site, and we'll probably start off by hosting everything at scipy.org. More on that when it happens.
Third, and I feel like this is a big enough issue that it's worth saying loudly and clearly, only one person in the room was positive about BioPython. Everyone else either had a bad opinion of it ("ugly", "non-Pythonic") or had been warned off by people with bad opinions of it -- and surprisingly it was dominated by the former and not the latter. To me this indicates that these feelings about BioPython are widely shared. I don't know that it's worth going into detail on why -- and we didn't cover it in depth at the BoF -- but it needs to be mentioned.
The general consensus was that we needed to get the BioPython guys involved in the biology-in-python mailing list, though, whether or not we wanted to use "their" code!
Fourth, there was general agreement that Python could solve a lot of problems for biology (big surprise there!) and that it could do so by providing next-generation solutions rather than simply providing a slightly nicer BioPerl-style interface. What this precisely means will have to be left to the imagination, but one experienced BioPerl user said that the type of stuff being done with pygr represented a real break with what he'd seen from bioinformatics previously.
At the same time, we all still need to parse, we still need to talk to big databases, and we still need to break down large problems. This suggests that there's room for at least common interfaces, if not necessarily One True Package. I hope to push on this area myself.
One person made a push for One True Package, but I argued that we had no Linus/Guido/Larry in the community. Perhaps we could go with a ring system like PostgreSQL, which seems to have no BDFL but instead a small group of sensible people who contribute? This might be an option for pushing "BioPython 3k" ;).
We will be setting up facilities to pimp other people's code, as well as places to discuss it, refine it, and help people build and test it. Even more interestingly, there was common agreement that doing something like hosting published results (+ code/source data) was a great idea. This is a second area where I hope to really push.
Not sure what else there is to say. Overall it was a good, albeit occasionally heated, discussion, and it was really good to meet everyone. Hopefully we can follow up with a PyCon BoF/Sprint where we can get more of the people together in one room!
--titus
posted at: 00:57 | path: /aug-07 | 0 comments
The SciPy 2007 Testing BoF
Thanks to a kind invitation by Fernando Perez, I was alerted to a BoF on Python/testing at SciPy. He made the mistake of introducing me as "the resident expert" so I felt even less inhibited than normal, which was hopefully not too problematic...
Gael Varoquaux took notes.
Basically, this was a lot of fun. I can't compare with previous years, but I feel like testing in Python is really being emphasized. Fernando shared some of his personal reasons for getting so interested, which was interesting in itself. Most people in the room seemed interested in getting started if they weren't already testing (well, duh, it was a BoF, but still...)
A few comments/thoughts/anecdotes.
I need to publicize the testing-in-python mailing list some more. It's quite low bandwidth and the signal-to-noise ratio has been essentially infinite (no noise!). Join!
nose is becoming very popular. Three or four of the people there (out of 20 or so) were using nose already, and had nothing but good things to say about it. I think extending my nose Introduction would be very relevant to the community.
Fernando really wanted to work within the stdlib, but several people tried to convince him that nose was worth the extra install.
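For readers who have not seen the two styles side by side, here is a minimal sketch of the same check written once with the stdlib's unittest and once as the kind of plain test function that nose collects. The `mean` function and both test names are hypothetical, made up purely for illustration.

```python
import unittest


def mean(values):
    """Hypothetical function under test: arithmetic mean of a sequence."""
    return sum(values) / len(values)


# stdlib style: a TestCase subclass with assert* helper methods.
class MeanTest(unittest.TestCase):
    def test_mean_of_integers(self):
        self.assertEqual(mean([1, 2, 3]), 2)


# nose style: a plain function whose name starts with "test",
# using a bare assert statement.
def test_mean_of_integers():
    assert mean([1, 2, 3]) == 2
```

nose will also run the unittest.TestCase version, so trying it does not require rewriting existing stdlib tests.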
There were a lot of very good requests for testing functionality that doesn't yet exist (see the wiki notes). There are some pretty good master's projects in there, actually...
The most interesting suggestion came from several people: they would like to be able to tie test results (performance, regression, code coverage) to specific code revisions and then query for results across revisions. I think something like the Test Anything Protocol might fit the bill, although it may be too simple.... Anyway, someone should develop this. Since I believe svn stores diffs, it could be as simple as appending the latest test results to a file kept under version control, although this could be really stupid for a big project ;).
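A minimal sketch of the append-to-a-file idea, to make it concrete. Everything here is an assumption for illustration: the one-JSON-object-per-line layout, the revision labels, and the function names (`record_results`, `load_history`, `first_failing_revision`, `timings`) are invented, and this is not the Test Anything Protocol or any existing tool.

```python
import json
import os
import tempfile


def record_results(log_path, revision, results):
    """Append one JSON line holding the test results for one revision."""
    entry = {"revision": revision, "results": results}
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def load_history(log_path):
    """Read every recorded entry back, in the order it was appended."""
    with open(log_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def first_failing_revision(history, test_name):
    """Return the earliest recorded revision where test_name failed, or None."""
    for entry in history:
        outcome = entry["results"].get(test_name)
        if outcome is not None and not outcome["passed"]:
            return entry["revision"]
    return None


def timings(history, test_name):
    """Return (revision, seconds) pairs for every revision that ran test_name."""
    return [
        (entry["revision"], entry["results"][test_name]["seconds"])
        for entry in history
        if test_name in entry["results"]
    ]


if __name__ == "__main__":
    # Hypothetical revisions and results, written to a throwaway file.
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "test-results.jsonl")
        record_results(log, "r101", {"test_parse": {"passed": True, "seconds": 0.12}})
        record_results(log, "r102", {"test_parse": {"passed": False, "seconds": 0.15}})
        history = load_history(log)
        print(first_failing_revision(history, "test_parse"))
        print(timings(history, "test_parse"))
```

Because every query re-reads the whole file, this only stays pleasant while the history is small, which is the "really stupid for a big project" caveat in practice.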
I recommend that people interested in GUI testing look at Qt and KWWidgets. These are both toolkits with test hooks built in. KWWidgets is not well known, but it seems deserving of further attention.
Nobody else knows how to do multiprocessing code tests, either.
There's clearly enough interest out there to support a few simple "intro" guides to testing in Python...
All in all, a really fun time -- thanks for organizing this, Fernando!
--titus
posted at: 00:57 | path: /aug-07 | 0 comments
SciPy '07: General Report
Last week, SciPy 2007 came to Caltech. Unfortunately, I wasn't able to attend many of the talks because I was busy with lab work and other deadlines, and because SciPy was held immediately upstairs from my lab I could just duck out to go back to work. However, I did attend a few of the talks and found all of them interesting -- and I heard that I seriously missed out on one or two. Hopefully other people will blog about those.
The talks I am qualified to comment on are the two other biology-based ones that I could attend. One was on pygr, given by Chris Lee, and the other was on Galaxy, given by James Taylor. Together with my talk on Cartwheel, I think they illustrated three entertainingly diverse approaches to bioinformatics and biology. Chris focused on addressing query paradigms (pygr implements a graph database) as well as data size issues in alignment and annotation of large sequences. James talked a lot about building workflow interfaces to packages, thereby presenting a possible solution to many of the current problems in bioinformatics such as formatting, reproducibility, and software interaction. And I talked about how Cartwheel provided biologists with one specific way to answer one specific kind of problem, and did so in about as simple a manner as you can imagine. I already have plans for cross-fertilization...
Hopefully we can attract other bio talks to SciPy next year!
I also gave a rather low-key tutorial on "idiomatic Python", which went OK but was, well, low-key. I attended two birds-of-a-feather sessions, one on testing (organized by fperez) and one on biology (organized by me). I'll write more about those later.
The BoFs and the after-talk interactions really convinced me that SciPy is a very useful conference for meeting other people who have similar problems to you (perhaps in slightly different fields). While I feel the talks are widely scattered by topic, ultimately a lot of the issues -- data handling, interfaces, visualization, parallelization -- are shared by everyone, and because we all use Python we can actually use each other's technology! Very cool stuff. Next year I'd suggest having more mixers and shorter presentations, so that we can get more of a sense of what's going on out there, but that's all I would suggest.
--titus
posted at: 00:22 | path: /aug-07 | 0 comments