Living in an Ivory Basement
Stochastic thoughts on science, testing, and programming.

Do software and data products advance biology more than papers?

There are many outputs from our lab and our collaborators - off the top of my head, the big ones are:

papers and preprints
software
data sets
blog posts and tweets
talk slides and videos
grant proposal text
training materials and tutorials
trainees (core lab members, rotation students, people who attend our workshops, etc)

Traditionally, only the first (papers) and some small part of the last (trainees who get a PhD or do a postdoc in the lab) are explicitly recognized in biology as "products". I personally value all of them to some degree.

In terms of actual effect, I believe that software, trainees, blog posts, and training materials are more impactful products than our papers.

In terms of taming the chaos of science, I view advances in our software's capabilities, and the development and evolution of our perspectives on data analysis, as a kind of ratchet that inexorably advances our science.

Papers, unless they accomplish the very difficult task of nailing down a concept and explaining it well, do very little to advance our lab's science. They are merely artifacts that we produce because they meet metrics, with the side effect of being one relatively ineffective way to communicate methods and results.

A question that I've been considering is this:

To what extent is the focus on papers as a primary output in biology (or at least genomics and bioinformatics) skewing our field's perspectives and slowing progress by distracting us from more useful outputs?

A companion question:

How (if at all) is the rise of software and data products as putative equivalents to papers leading to epistemic confusion as to what constitutes actual progress in biology?

To explain this last point a bit more:

It's not clear that many papers really advance biology directly, given the flood of papers and results and the resulting loss of ability to read and comprehend them all in a particular subject. (This is more true in some areas than in others, but you could also argue that big fields are getting subdivided into narrower fields because of our inability to comprehend the results in big fields.)

More and more, the results of papers need to be incorporated into theory (difficult in bio) or databases and software before they become useful in biology.

From this perspective, good data and software papers actually advance biology more than a specific finding.

I don't think this is entirely right but I feel like the field is trending in this direction.

But most senior people are really focused on papers as outputs and ignore software and data. This makes it hard for me to talk to them sometimes.

Ultimately, of course, insight and cures, for lack of a better word, are the rightful end products of basic research and biomedical science, respectively. So the question is how to get there faster.

Are papers the best way? Probably not.

Some side notes

I've been pretty happy with the way UC Davis handles merit and promotion, in that faculty in my department really get to explain what they're doing and why. It's not all about papers here, although of course for research-intensive profs that's still a major component.

Acknowledgements

This blog post was greatly inspired by conversations with Becca Calisi-Rodriguez and Tracy Teal, as well as (as always) the members of the DIB Lab. Thanks!! (I'm not implying that they agree with me, of course!)

I'm particularly indebted to Dr. Tamer Mansour, who, a year ago, said (paraphrased): "This lab is not a research lab. Mostly we train people, and do software engineering. Research is a distinct third." I disagree but it sure was hard to figure out why :)

--titus

