Sun 31 August 2014
C. Titus Brown
As I mentioned, I am hoping
to significantly scale up my training efforts at UC Davis; it's one of
the reasons they hired me, it's a big need in biology, and I'm
enthusiastic about the whole thing! A key point is that, at least at
the beginning, it may replace some or all of my for-credit teaching.
(Note that the first four years of Analyzing Next-Generation
Sequencing Data counted as outreach, not teaching, at MSU.)
I don't expect to fully spool up before fall 2015, but I wanted to
start outlining my thoughts.
The ideas below came in large part from conversations with Tracy Teal,
a Software Carpentry instructor who is one of the people driving Data
Carpentry, and who also was one of the
EDAMAME course instructors.
How much training, how often, and to whom?
I think my initial training efforts will center on
Carpentry-style workshops, on a variety
of (largely bio-specific) topics. These would be two-day in-person
workshops, 9am-5pm, each focused on a specific topic.
I think I can sustainably lead one a month, with perhaps a few months
where I organize two in the same week (M/Tu and Th/Fri, perhaps).
These would be on top of at least one NGS course a year, too. I also
expect I will participate in various
Genome Center training workshops.
The classes would be targeted at grad students, postdocs, and faculty
-- same as the current NGS course. I would give attendees from VetMed
some priority, followed by attendees with UC Davis affiliations, and
then open to anyone. I imagine doing this in a tiered way, so that
some outsiders could always come; variety and a mixed audience are
On what topics?
I have a laundry list of ideas, but I'm not sure what to start with or
how to make decisions about what to teach when. ...suggestions welcome.
(I also can't teach all of these myself, but I want to get the list of
I'd like to preface this list with a few comments: I've been teaching
and training in these topics for five years (at least) now, so I'm
not naive about how hard (or easy) it is to teach this to computationally
inexperienced biologists. It's clear that there's a progression of skills
that need to be taught for most of these, as well as a need for careful
lesson planning, tutorial design, and pre/post assessment. These workshops
would also be but one arrow in the quiver -- I have many other efforts
that contribute to my lab's teaching and training.
With that having been said, here's a list of general things I'd like to
Shell and UNIX (long running commands, remote commands, file and path management)
Scripting and automation (writing scripts, make, etc.)
Bioinformatics and algorithms
"Big data" statistics
Data integration for sequencing data
Software engineering (testing, version control, code review, etc.) on the open source model
Practical bioinformatics (See topics below)
Modeling and simulations
Workflows and replication tracking
I have many specific topics that I think people know they want to learn:
Mapping and variant calling
Genome assembly and evaluation (microbial & large genomes both)
Transcriptome assembly and evaluation (reference free & reference based)
Differential expression analysis
Microbial ecology and 16S approaches
Functional inference (pathway annotations)
Genotyping by sequencing
And finally, here are two shorter workshop ideas that I find
particularly neat: experimental design (from sample prep through
validation), and sequencing case studies (success and failure
stories). In the former, I would get together a panel of two or three
people to talk through the issues involved in doing a particular
experiment, with the goal of helping them write a convincing grant. For
the latter, I would find both success and failure stories and then
talk about what other approaches could have rescued the failures, as
well as what made the successful stories successful.
To what end? Community building and collaborations.
Once I started focusing in on NGS data at MSU as an assistant
professor, I quickly realized that I could spend all my time in
collaborations. I learned to say "no" fairly fast :). But all
those people still need to do data analysis. What to do? I had
no clear answer at MSU, but this was one reason I focused on
At Davis, I hope to limit my formal collaborations to research topics,
and concentrate on training everybody to deal with their own data; in
addition to being the only scalable approach, this is career-building
for them. This means not only investing in training, but trying to
build a community around the training topics. So I'd like to do
regular (weekly? fortnightly?) "help desk" afternoons for the campus,
where people can come talk about their issue du jour. Crucially, I
would limit this to people that have gone through some amount of
training - hopefully both incentivizing people to do the training, and
making sure that some minimal level of effort has been applied. The
goal would be to move towards a self-sustaining community of people
working in bioinformatic data analysis across multiple levels.
Cost and materials.
Since UCD VetMed is generously supporting my salary, I am naively
expecting to charge nothing more than a nominal fee -- something that
would discourage people from frivolously signing up or canceling.
Perhaps lunch money? (This might have to be modified for people from
outside of VetMed, or off-campus attendees.)
All materials would continue to be CC0 and openly available, of course,
'cause life's too short to limit the utility of materials.
I'd love to put together a slush fund so that I can invite out speakers
to run workshops on topics that I don't know that well (most of 'em).
How about a workshop focused on teaching people how to teach with the
materials we put together? (I would expect most of these workshops
to be cloud-based.)
p.s. In addition to Tracy, thanks to Keith Bradnam, Aaron Darling,
Matt MacManes and Ethan White, for their comments and critiques on a