I just finished a day at the SIAM CSE 2019 conference, where I gave a talk
as part of a mini-symposium on software sustainability (my abstract,
and my talk slides; see the 'cpr' tag for all my recent blog posts on this topic.)
When I was outlining the talk, I spent a fair amount of time noodling
about how I wanted to approach the subject. I have a lot of
disorganized thoughts that I think can be put together in interesting ways,
but for a 20 minute talk, I really needed to pick a narrow focus.
Here's what I ended up with. I'm curious for reactions!
Defining a term, "communities of effort"
I'll start by defining "communities of effort" as a community formed in
pursuit of a common goal. The goal can be definite or indefinite in
time, and may not be clearly defined, but it's something that (generally
speaking) the community is aligned on.
The term "effort" here refers to focused or engaged attention,
and in this sense in particular, I mean the focused attention applied
towards the common goal.
One rational goal of such a community is to achieve the goal without
wasting effort through duplication or redundancy in work. This connects with
my earlier blog post on the open source anti-Sisyphean League, a
term coined by Cory Doctorow: the idea is that there are a number of
rocks to be rolled up hills, and (in an open community) there is no
reason for people to roll those rocks up the hill independently, since
they can take advantage of each other's efforts.
This community of effort directs itself towards achieving the goal,
applying the available effort to the task. Here, effort is a finite
resource that is consumable - you cannot apply the same effort to more
than one task, and the effort that is applied towards one task is not
available to be applied to another task. (Of course, the available
effort can be renewed or increased - more on that later.)
Effort as a common pool resource
The trickiest and most uncertain link is this: I think that the effort
applied towards the common goal is, to some extent, directed by the
community. That is, the available effort - which consists of work
by individuals towards the collective goal - is at the very least
loosely coordinated with the community, if not coordinated more closely.
(This may be because the community needs to be involved in order to
decrease redundancy. Not sure.)
If this is true - that effort is coordinated by the community rather
than the individual, and so is non-excludable, and also is a finite
resource that can be consumed, and is thus rivalrous, this turns
it into a common pool resource.
Common pool resources are well known to anyone who has heard of the
tragedy of the commons: they are resources that are subject to this
tragedy, of being consumable by many in an unregulated way.
What are some examples of these "communities of effort"?
A prime example is open source projects like Python. They're rooted
in a community approach; they're not not run by a corporation or a
government agency; and any structure (like a nonprofit) is created
after they already exist (and usually after they are successful!)
I think the Carpentries training community is another good example.
This is a community of people interested in teaching and training in
data science and software engineering that essentially self-assembled,
and is aligned around their mission (of teaching and training). The
non-profit structure around it is, again, an ex post facto creation.
Data analysis commons, in which methods, data, compute resources, and
data analysis interfaces are coordinated to address the data analysis
needs of a community, would be another example.
(Wikipedia might be another, but I'm less familiar with how it works.)
Why do we care about these communities?
Well, these communities are amazingly effective, in at least
some cases. For example, Python and R between them are essentially
the modern data science languages - both are open source, both
are community coordinated.
More generally, it is probably not an exaggeration to say that the
products of open source communities of effort underly the vast
majority of Silicon Valley software, as well as most research software.
Sustaining, growing, and supporting these communities is pretty
How do these communities get started, and why are they effective?
One feature of successful communities of effort - those that seem to
succeed in growing their pool of available effort - is that they are
often very organic in their approach to tackling their mission. This
is probably an effect of the community-based approach, in that the
members of these communities are to a reasonably significant extent
self-motivated and self-directed to solve their problems, and so the
solutions are often bottom-up created with only a light level of
coordination on top. (I'll revisit this in terms of governance in a
The other kind of fun thing is that these days it's pretty easy to bootstrap
a community of effort: with some enthusiasm and a site like GitHub, you can
spin up a new community project quite quickly.
Last but not least, many (most? all?) communities of effort have at
least one person who has placed their effort at the service of the
community mission. These are the leaders and/or maintainers of the
So what's the problem? It's all good, right?
Well... there are a few things I don't really understand.
For one, the formation of large groups of people who sustain a collective
to pursue a common goal violates basic tenets of collective action - at least,
as I understand them. The idea here is that, if there is a large group
of people pursuing a common goal, then the smart (economically rational) thing
for someone to do is ...not do any work at all, because the individual will
reap the benefits of the group work. So, what's different with these
communities of effort?
Sustainability and in particular maintenance is a big question, too;
these communities often rely on one or a few core maintainers to make things
happen, and it is really unclear why these maintainers (who are often
unpaid or underpaid) would take on these tasks. Yes, they get kudos and
reputation, but kudos and reputation do not put food on the table... why do
they do it?
(One thought - perhaps the creation of a successful community
of effort really depends on there being at least one person who ignores
short-term economic rationality? So then you just don't see all the failed
attempts where someone decides not to be irrational and hence not bother?
Another thought is that perhaps the key aspect of many of these communities
being open means that the maintainer-type folk realize that
no one else is tackling the common goal, and since they need the goal met
as well, they might as well do it?)
Does framing the problem as a common pool resource problem yield any solutions?
I think it does.
First, once you recognize effort as the limiting resource, the
question of how to maintain and increase that resource comes to the
forefront. There are a number of possible mechanisms, including
investing in making the community easy or rewarding to join, welcoming
new contributors, and/or providing special methods or data or access
to community members. In this view, these activities become more central
than they are if you are thinking only about the overall goal or mission
of the community.
Second, Elinor Ostrom outlined some design principles for
sustainability of common pool resources based on empirical studies, in
Governing the Commons. One of these principles is about making
collective choice arrangements that allow most of the appropriators
(members of the community) to participate in the decision making
Basically, this boils down to rewarding people who invest effort with
some level of influence in how that effort is applied towards the
community goals. This both incentivizes participation with collective
ownership, and also seems to allow a form of organic communication
where the people applying effort feed results from their work back
into the overall community direction. This is, to my mind, one of the
things that leads these communities to be so effective.
This mode of governance by members of the community for the
community goal leads to another interesting thought. Funders
participate in these communities in indirect ways, by seeking to fund
(or being sought out to fund) effort within the community. Rarely is
the direction of this support directly dictated by the funder; it's
usually laundered through the community member(s) being supported.
This is both good and bad - it limits the degree to which funders (and
companies) can directly influence the project, but also means that
funders may not be able to easily identify the uses to which their
money will be put.
Who is part of the community of effort?
Anyone who contributes their effort is part of the community, and hence
should get some form of influence over governance (by the above design
Extractive contributors - contributors who do not contribute to the
overall effort, especially the maintenance effort - would not,
however, be considered part of the community. See
How open is too open? for this argument.
People who are using the product of the community but not costing the
community any effort (e.g. consumers of the source code) would also
not be part of the community, unless they contribute in some way to the
One interesting result of this kind of thinking is that, for data
analysis commons, people who provide data or methods, or training
people, or contributing documentations, are contributing effort. This
provides a level of rational inclusion of this kind of work within the
community, and also in governance; they are in a direct sense
contributing to the sustainability of the community of effort.
Is academia a good home for these communities of effort?
I note that the leadership and governance model in basic research, at
least, is often not inclusive of the people who are doing the work,
and instead centers on reputation and hierarchy. I don't think
universities and colleagues focused on basic research are likely to be
a good part of the support network for communities of effort, in
I have been quite impressed with what I've seen of extension efforts at
universities, which are faculty-level investments of time and energy
in communities. I'm planning to look more in to the idea of a
digital extension model.
Some final thoughts
I think it's important to recognize that (these days at least) there
are lots of competing projects in which people can invest their time
and effort, and it's probably not a bad thing to frame it as a
competition between these communities for people's time and
attention. Communities that do a good job of attracting contributors
and incentivizing the inclusion of effort can win out and potentially
be more sustainable than communities that do a lousy job. (This has
potentially dire implications for some scientific research communities,
which are not always very welcoming or inclusive. I'm not sad about this.)
This framing also puts soft skills front and center in the equation,
and I think this is also a good outcome.
Open / unaddressed questions
Two open questions.
First, what are communities of effort not good at? I would venture
that any boring or maintenance level jobs would tend to be addressed
poorly by these communities, due to how human enthusiasm works.
Second, I want to return to the missing link mentioned above - that
these communities seemingly depend on one or more people placing their
effort in service of the community. What are the reasons why people do
this, and how do we support and maintain it? Inquiring minds want to
know... It would be nice if we had a reasonably comprehensive picture
of why this occurred, because it doesn't seem like rational behavior
on the face of it. (I'm very thankful that people do this, of course,
which is why I want to better support this path!)
I gratefully acknowledge Adam Resnick, Matter Trunnel, Josh Greenberg,
Nadia Eghbal, Luiz Irber, and Tracy Teal, with whom I've had inspiring
conversations on these fronts.
The NIH (via the Data Commons funding) and the Moore Foundation
provided funding to me to think about, read about, and explore these
There are comments.