Published: Thu 02 April 2015
By C. Titus Brown
In science.
tags: data, open data
A few weeks back, a journalist contacted me about my old blog post
comparing physics and biology,
and amidst other conversation, I pointed them at my latest blog post
on data
and said that I thought a lot of (molecular) biologists were
"culturally confused about data". The next question was, perhaps
obviously, "what do you mean by that?" and I wrote in response:
... for molecular biologists, "data" is what they collect piece
by piece with PCR, qPCR, clone sequencing, perturbation experiments,
image observations. It is so individual and specialized to a problem that to
share it prior to publication makes no sense; no one could understand it all
unless it were fit into a narrative as part of a pub, and the only useful
product of the data is the publication; access to the data is only useful
for verifying that it wasn't manufactured.
Sequencing data was one of the first outputs (as opposed to things like
reagents, antibodies and qPCR primers) that was useful beyond a particular
narrative. I might sequence a gene because I want to knock it down and
need to know its leader sequence for that, but then you might care about
its exonic structure for evolutionary reasons, and Phil over there might
be really interested in its protein domains, while Kim might be looking at
an allele of that gene that is only in part of the population.
I'm probably overstating that distinction but it helps explain a LOT of
what I've seen in terms of cultural differences between my grad/postdoc labs
(straight up bio) and where I think bio is going.
I'm sure I'm wrong (certainly incomplete) about lots of this, but it does
fit my own personal observations. Other perspectives welcome!
I decided to write this up as a blog post because I read Carly
Strasser's excellent blog post introducing open science,
which emphasizes data, and it made me think about my response above.
It's interesting to consider how differently "data" is interpreted
across fields, and I'd like to stress how important it is that
we bridge the gap between these high-level views and day-to-day
practice in each subdomain - the culture and language can vary so
significantly between even neighboring fields!
Oh, and Carly Strasser is now one of the Moore Data Driven Discovery
Initiative Program Officers - I'm really
happy to see the Moore Foundation confronting these aspects of data head
on by hiring someone with Carly's experience and expertise, and I look
forward to interacting with her more on these issues!
--titus