Do software and data products advance biology more than papers?

There are many outputs from our lab and our collaborators - off the top of my head, the big ones are:

  • papers and preprints
  • software
  • data sets
  • blog posts and tweets
  • talk slides and videos
  • grant proposal text
  • training materials and tutorials
  • trainees (core lab members, rotation students, people who attend our workshops, etc.)

Traditionally, only the first (papers) and some small part of the last (trainees who get a PhD or do a postdoc in the lab) are explicitly recognized in biology as "products". I personally value all of them to some degree.

In terms of actual effect, I believe that software, trainees, blog posts, and training materials are more impactful products than our papers.

In terms of taming the chaos of science, I view advances in our software's capabilities, and the development and evolution of our perspectives on data analysis, as a kind of ratchet that inexorably advances our science.

Papers, unless they accomplish the very difficult task of nailing down a concept and explaining it well, do very little to advance our lab's science. They are merely artifacts that we produce because they meet metrics, with the side effect of being one relatively ineffective way to communicate methods and results.

A question that I've been considering is this:

To what extent is the focus on papers as a primary output in biology (or at least genomics and bioinformatics) skewing our field's perspectives and slowing progress by distracting us from more useful outputs?

A companion question:

How (if at all) is the rise of software and data products as putative equivalents to papers leading to epistemic confusion as to what constitutes actual progress in biology?

To explain this last point a bit more: it's not clear that many papers really advance biology directly, given the flood of papers and results and the resulting loss of ability to read and comprehend them all in a particular subject. (This is more true in some areas than in others, but you could also argue that big fields may be getting subdivided into narrower fields because of our inability to comprehend the results in big fields.)

More and more, the results of papers need to be incorporated into theory (difficult in bio) or databases and software before they become useful in biology.

From this perspective, good data and software papers actually advance biology more than a specific finding.

I don't think this is entirely right, but I feel like the field is trending in this direction.

But most senior people are really focused on papers as outputs and ignore software and data. This makes it hard for me to talk to them sometimes.

Ultimately, of course, insight and cures, for lack of a better word, are the rightful end products of basic research and biomedical science, respectively. So the question is how to get there faster.

Are papers the best way? Probably not.

Some side notes

I've been pretty happy with the way UC Davis handles merit and promotion, in that faculty in my department really get to explain what they're doing and why. It's not all about papers here, although of course for research-intensive profs that's still a major component.


This blog post was greatly inspired by conversations with Becca Calisi-Rodriguez and Tracy Teal, as well as (as always) the members of the DIB Lab. Thanks!! (I'm not implying that they agree with me, of course!)

I'm particularly indebted to Dr. Tamer Mansour, who, a year ago, said (paraphrased): "This lab is not a research lab. Mostly we train people, and do software engineering. Research is a distinct third." I disagree, but it sure was hard to figure out why :)

