Pubwication of software papers, and authorship on them

Pubwication. Pubwication is what bwings us togethew today. Pubwication, that bwessed awwangement, that dweam within a dweam. And authorship, twue authorship, wiww fowwow you fowevah and evah. So tweasuwe youw authorship.

Last week, our software paper on khmer 2.0 was published on F1000Research. We intend this paper to be a citation marker, but it also represents and recognizes some significant software engineering work done between khmer 1.x and khmer 2.0.

As part of the paper process, we offered authorship to everyone who has contributed to the khmer git repository - anyone who contributed to the repo was invited to sign on to the paper.

Addendum: I would like to credit Michael Crusoe with the initial suggestion to offer authorship to all git committers. This is in no way backing away from my own support for this decision, but I only realized a few days after writing the post that I had failed to properly credit Michael. So, kudos, Michael!

This policy has caused some consternation amongst the Twitterati, some (well, ok, one) of whom recoiled in horror at our author list, pointing at the recommendations of (e.g.) the International Committee of Medical Journal Editors. These recommendations are that authorship be based on (1) design and/or analysis, AND (2) writing, AND (3) final approval, AND (4) accountability. While the third and fourth criteria were met by all of the authors of this paper, the first and second were probably not met by all authors.

A few points are in order:

  • our condition for authorship is explicit, verifiable, and transparent. You can look up contributors in our release notes and GitHub repo to find out exactly what they did.

    We note that this clarity and verifiability are in contrast to most authorship.

  • authorship has been getting more complicated, and traditional authorship roles are both ill-defined and clearly inadequate for modern research. There is an ongoing effort to define authorship roles more clearly and explicitly, and, coincidentally, GigaScience just announced they're signing on to this.

    For those who are curious, the majority of authors on our F1000Research paper fall under #3 on the CRediT taxonomy.

    (We did not talk with F1000Research about their support for this taxonomy in advance.)

  • our project is an open source project that is developed by a community, with contribution requirements and an extensive process for contributing. Our author list is an explicit acknowledgement of the role that the community has played in developing khmer, and the work that each and every contributor invested in our process.

  • assuming this paper passes into the peer reviewed literature, there may be some interesting consequences of our authorship criteria. For example, since most formal definitions of Conflict of Interest include shared authorship, Jared Simpson and Lex Nederbragt and I would now be in conflict and I would be unable to review their grants or papers. This seems silly to me!
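To make the "verifiable" part of the first point concrete, here is a minimal sketch of how anyone with a local clone of a git repository could list its committers. It is not the script we used to build the author list; the function names and the `repo_path` default are hypothetical, and it assumes `git` is installed. It only relies on the standard `git shortlog -sne` summary output (commit count, name, email).

```python
import re
import subprocess

# One line of `git shortlog -sne` output looks like: "    42\tName <email>"
_SHORTLOG_LINE = re.compile(r"^\s*(\d+)\s+(.*?)\s+<([^>]*)>\s*$")


def parse_shortlog(text):
    """Parse `git shortlog -sne` output into (commits, name, email) tuples."""
    contributors = []
    for line in text.splitlines():
        match = _SHORTLOG_LINE.match(line)
        if match:
            count, name, email = match.groups()
            contributors.append((int(count), name, email))
    return contributors


def list_contributors(repo_path="."):
    """Run git in a local clone (repo_path is an assumption) and parse the result."""
    result = subprocess.run(
        ["git", "shortlog", "-sne", "HEAD"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_shortlog(result.stdout)
```

Calling `list_contributors("path/to/clone")` returns one tuple per committer identity as recorded in the commits; a name in that list is whatever name the contributor chose to commit under.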

Speaking with my senior author hat on, I hope it's clear that we are not trying to mock authorship in any way, and this is a serious publication on a serious project.

That having been said, we are trying out something new - in particular, we would like to figure out how to acknowledge software authorship within the scientific literature, both because it's the right thing to do AND because we'd like to incentivize community development of software. This is part of an ongoing discussion about the changing roles of contributorship in research (see first point, above).

On names

Some note has been made of the presence of what people presume to be pseudonyms in our author list. There is a long history in science of choosing a specific name or pseudonym to publish under; see Student, for one example (ht @rgcjk) of very, very many. We support this tradition.

Respecting people's chosen names is also important for many other reasons. I suggest people read through the Nymwars Wikipedia page, and pay special attention to the "criticism" section, which raises ethical, moral, and legal reasons why a "real names" policy is problematic.

Please note that there is a special place in hell reserved for people who attempt to deanonymize someone's pseudonym on a whim; this is both unprofessional and potentially harmful to the individual in question. Yes, I'm talking to you, Lior.


p.s. Please comment responsibly! On this post and all future posts, I am going to follow the Captain Awkward comments policy - specifically, "...sometimes comments don't show up because I delete them. This is a dictatorship, and I can delete any comment at any time for any reason."
