Open Science, and Risk/Benefit Analysis

In thinking about open science and open communication about science, I've always been frustrated by the people who claim that the risks outweigh the benefits. Their arguments seem sound if you buy into a certain kind of logic (the creationists will try to twist whatever you say! the climate change deniers will use your words in ways you did not intend! people will steal your research! you cannot communicate openly about what you're doing!) but I could never pin down why they frustrated me so much. I had a eureka moment about it today, though.

When someone tells me that (for example) we should not make all BEACON research proposals fully public because they will be misinterpreted by creationists and used against us, they are saying this: in their personal opinion, the identified risks outweigh the identified benefits. They already know (and I agree that this will happen) that people will take the BEACON-funded study of -- for example -- some fascinating tailless ascidians as a scientific boondoggle, an excuse for a trip to France that won't result in anything but more incomprehensible literature about chordate origins. And they can't imagine that, without careful shaping of the message and management of the public image, this will not happen. Since there's no particularly obvious benefit to posting the proposals publicly, the risks (of misinterpretation) outweigh the benefits (of some nebulous "open science" thingy). So halt! the publication.

The same arguments apply to climate change (but they'll just misuse/misinterpret the data!) and open science in general (but someone will just steal my data/ideas/...!)

This is fundamentally a failure of imagination. It is doing a risk analysis based on your worst fears, and neglecting a benefit analysis of your wildest hopes.

Some examples:

In the case of BEACON, we have a sprawling collection of 100 faculty spread across 5 institutions. I have literally no idea what more than half of them are doing. Wouldn't it be great if I could do a text search of their proposals, and even better if I could stumble across a BEACON colleague in a Google search on some topic or other? Or if we could attract students that didn't even know they were interested in "evolution in action", but came to our Web site based on Google's indexing of a rich array of research projects and then found themselves hooked?

What about the climate change skeptic (or agnostic) who suddenly gets a chance to sit down and look at all the data and can conclude that hey, this is actually really complicated? And it's probably not as simple as the skeptics claim? (Aside: I'm unbelievably pissed at the climate change community for the idiocy of their current closed-ness.)

And what about the collaborators that I could get (and am getting) from posting about some of our projects? In the worst case, I post about things and no one pays any attention; in the best case that I can think of, I make connections and establish cred that enables future collaborations, publications, and grant opportunities. (This is already happening.)

At the heart of science is an ethos that has to include openness in order to work properly. Any constriction in the flow of ideas and the interchange of opinions is a block in the very lifeblood of science itself. If we indulge those who argue against free communication, we are preventing not only some imagined negative consequences, but all of the happy coincidences that are beyond our limited imaginations.

So turn on, tune in, and don't drop out.


Legacy Comments

Posted by Keron Greene on 2010-09-04 at 12:36.

While I agree that science, at its core, relies (and thrives) on a
high level of openness, there is a serious case to be made for
limiting the amount of publicly available detail on new proposals or
in-progress projects.  It would be like giving away the secret recipe
before you publish the cookbook. Isn't it better instead to let
everyone know that you are working on [insert important principle
here], full details to follow?

Posted by Titus Brown on 2010-09-04 at 15:24.

Hey Keron,

is the key component underlying that argument the issue of who gets
credit for the work?  e.g. Premature discussion of [important
principle] would let others replicate the research and scoop some or
all of it before you were ready to publish?  That's certainly an issue
(and it helps highlight the degree to which current peer review and
publication practices actually inhibit science -- lots of discussion
of that going on elsewhere) but I'd like to offer two countervailing
considerations.

First, it's generally pretty hard to get anyone to pay attention to
your work in the first place, even when it's nicely packaged and
written up.  There's some fairly grim statistics about how many
research papers sink without a trace... if you're convinced that what
you're working on is earth shattering or at least something that's
likely to be "stolen", then by all means reserve some of the data or
some of the tools.

I have to say if that were where people were drawing the line on
disclosure, I'd understand it much better.  But when I talk to people,
that's not their concern -- or, when it is, it's a fear rather than a
concern grounded in experience.  (See above blog post.)

Second, you always have to be wary of half- or 3/4-baked research,
even by good labs.  You could try to use any of the code I've checked
in to khmer over the last few weeks to do some pretty cool things, but
it turns out that some of that code was ... wrong.  (I know, shocking,
right??)  It's also not well documented, despite some serious efforts
on my part, and it's certainly not going to be easy to take it and use
it to jump-start your own research in this area.  People generally
talk about their sexy ongoing research at conferences, but then
discover problems with their data between then and publication.  How
is that any different from talking about it in public?

Never mind that it's unethical to swipe people's data and code without
attribution... which poses practical challenges.  If someone swipes
hyena data from a BEACON lab, how on earth are they going to write the
methods section honestly, and how are they not going to get caught!?

Overall, I think people who talk about giving away the secret
ingredient in their research are falling prey to exactly what I say
above: focusing on their fears rather than their hopes.

There's also an interesting question about how important and novel
your research is... if it's enough for you to say "I'm going to do X
to Y!" and immediately someone else can steal this idea and do all the
work and receive all the credit, then I would submit that it's
probably not a terribly groundbreaking idea.  Or, alternatively, it's
groundbreaking but other people can do it -- and I'd prefer to move on
to stuff that only I can do.  Personally I find that ideas are cheap,
while real work and actual results are expensive, and I don't know
many scientists that would disagree with that.

Anyway, I'd be much more impressed if we were actually discussing
**how much** climate data to release.  Since instead we're discussing
whether **any** of the disputed climate data should be publicly
released, I'm going to feel free to continue to be grumpy about it...

--titus

Posted by Erich Schwarz on 2010-09-05 at 19:18.

With climate science, we've got the following combination:

1. Scientists who want their computer models of the year 2110 to be
   used to justify cutting the income of the first world by tens of
   trillions of dollars.

2. Scientists who get really, really squirrely if you ask them to
   open-source their *prediction software*, let alone their raw data.

3. Scientists who think not merely that they're generally correct, but
   that they're practically infallible and have nothing to learn from
   opening up their predictions to external "unqualified" critics such
   as Freeman Dyson.

Doomed, stupid, narcissistic.

Open-source isn't just a good idea, it's the only hope the climate
scientists have of being taken seriously again.  I suspect people just
entering the field -- people in their early 20s, for whom open-source
is an established success in other areas of computation -- are likely
to understand that tolerably well.  However, having talked one-on-one
with Benjamin Santer about open-source, and gotten an underwhelming
response, I very much doubt the current luminaries in the field will
ever understand it.
