Openness and online reputation recognized in grant reviews!

For each of the last two summers, I've returned from co-teaching our Analyzing Next-Generation Sequencing Data course, slept for 48 hours straight, and then hunkered down and bunkered up to write grants. (To be clear, sometimes this bunkering up involves travelling out to California and sitting on my in-laws' beach, so the environment is not always very bunkerish.) Last year this resulted in a bunch of grant proposals, including a resub NSF CAREER, a resub NIH R25 for the summer course, a Cyberinfrastructure Education supplement (funded!), an ill-fated Moore Foundation proposal, an NSF IGERT-CIF21 training grant proposal, and an NSF/NIH BIGDATA proposal.

Digression: just in case you are thinking of trying to write upwards of 8 grants in a 6-month period? Don't. It's not fun, even if you're on a beach some of the time.

The BIGDATA proposal (oddly termed a "Small BIGDATA proposal"...) was really a shot in the dark; it proposed to build on our digital normalization algorithm to implement a variety of low-memory streaming prefilters for sequencing data. I included a lot of preliminary data and tried to build my usual connections between good computer science, good computing, good software engineering, and good open science. I had no idea if this aspect of the grant would be well received.

Most of these interests have not been encouraged, outside of the blogosphere and tweetstream. Administrators at MSU have ranged from offering amused and only slightly condescending comments ("well, it probably won't hurt much, but is it really that important!?") to outright discouragement or disbelief ("waste of time and energy"). Many of my fellow faculty have realized that I'm passionate and loud about it and have at least stopped arguing that it's a waste of my time.

Well, I just got reviews back for the BIGDATA grant, and -- while I still don't know if it's going to be funded -- all of the reviews specifically and positively mentioned my "side" activities in open source, scientific software engineering and testing, and open science/open data; see the excerpts below. I'm particularly happy that one reviewer referenced our diginorm paper on arXiv, because it lends weight to the argument for preprints.

Reviewer 1:

"The PI is acknowledged as a leader in the field in promoting open-source code and rigorous software development processes."

"PI Brown takes a very open-source community-driven approach to software development and has a first rate reputation."

"Because the authors are clearly committed to the widest possible distribution of their software, their approach is likely to have substantial impact."

"This application points to a long and successful track-record and experience in following rigorous but open software development processes."

"As noted above, the PI clearly understands how to rigorously test software with automated test harnesses and a proper source-controlled software development environment. In addition, because the PI emphasizes community input into the work, the developed libraries will likely be tested and refined by many users."

"The use of IPython notebooks to share execution environments is particularly intriguing. Software will be shared under an open-source model, which is entirely appropriate for this type of software library work that will serve as a community resource. The github "pull" request model will support automatic patching and further strengthens the software sharing plan."

Reviewer 2:

"Laboratory employs effective software development methods. P.I. is interested in and has a track record of developing usable software."

"The proposers are appropriately careful about use of source control and handling of software and source code. The software sharing plan is appropriate."

Reviewer 3:

"I was aware of the basic methods of this proposal, having previously read the PI's arXiv manuscript."

"The PI is a strong advocate and leader for improved data sharing and computational reproducibility. Software sharing plan is excellent. The PI demonstrates a commitment to open software."


And, finally, one entertaining comment --

"As the PI points out, lack of infinite computational resources has actually been a motivator of the proposed path to make genome analysis more efficient in memory and time. The environment at MSU, however, seems more than adequate for the proposed work."

(I'm not sure who does have infinite computational resources ;)


After a long history of hearing secondhand about scorn for these workmanlike details in grants (for a good time, go ask Fernando Perez about his NSF review panel experiences! stand a few feet away, though...), I'd love to argue that these comments indicate a sea change in review panels, and that computational biologists should now be on notice that good and open software development is a critical component of grants. That's probably not the case: the BIGDATA call was very specifically about real software. But certainly it's a good sign, and -- assuming we get funded -- it will strengthen my arguments that this kind of stuff really matters for important things like getting money.

Oh, and a plug -- we teach the kind of stuff mentioned in the comments above in Software Carpentry, and we'll come give a workshop at your institution for the cost of travel. Check it out.

