Here is a links roundup and some scattered thoughts on the recent
meeting on "the next five years of data science at the NIH"; this
meeting was hosted by Phil Bourne, the new Associate Director
for Data Science at the NIH.

If you're into metagenomics, you may have heard of CAMI, the Critical
Assessment of Metagenome Interpretation. I've spoken to several people about
it in varying amounts of detail, and it seems like the CAMI group is
working to generate some new shotgun metagenome data sets and will

This recipe provides a time- and memory-efficient way to
loosely estimate the likely size of your assembled genome or
metagenome from the raw reads alone. It does so by using digital
normalization to assess the size of the coverage-saturated de Bruijn
assembly graph given the reads ...
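
To make the idea concrete, here is a minimal sketch of the approach in plain Python. It is not the recipe's actual implementation: it assumes an exact in-memory k-mer counter instead of a memory-efficient probabilistic one, ignores reverse complements and sequencing errors, and the function names (`normalize_by_median`, `estimate_size`) and the defaults for `k` and `cutoff` are hypothetical choices for illustration. The reasoning is that once reads are normalized to a median k-mer coverage of roughly `cutoff`, the bases that survive divided by `cutoff` give a loose size estimate.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_by_median(reads, k=4, cutoff=3):
    """Toy digital normalization.

    Keep a read only if the median count of its k-mers, among the
    reads kept so far, is still below `cutoff`.
    """
    counts = Counter()
    kept = []
    for read in reads:
        kms = kmers(read, k)
        if not kms:
            continue  # read shorter than k, nothing to count
        if median(counts[km] for km in kms) < cutoff:
            kept.append(read)
            counts.update(kms)
    return kept


def estimate_size(reads, k=4, cutoff=3):
    """Loose size estimate: bases kept after normalization / cutoff."""
    kept = normalize_by_median(reads, k, cutoff)
    kept_bases = sum(len(r) for r in kept)
    return kept_bases / cutoff


# Hypothetical 19 bp "genome" sequenced ten times over, error-free.
genome = "ACGTTGCAAGGCTATCCGA"
reads = [genome] * 10
print(estimate_size(reads))
```

With real data the reads are short, overlapping, and noisy, so the estimate is only approximate; the toy input above is chosen so that normalization keeps exactly `cutoff` copies of the sequence.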

This recipe provides a time-efficient way to determine whether you've
saturated your sequencing depth, i.e. how much new information is
likely to arrive with your next set of sequencing reads.
It does so by using digital normalization to generate a "collector's
curve" of information collection.
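
As a rough illustration of what such a curve looks like, the sketch below streams reads through a toy digital normalization and records how many reads have been kept after every `step` reads seen. This is an assumption-laden sketch, not the recipe's code: it uses an exact k-mer counter, ignores reverse complements and errors, and the name `collectors_curve` and its parameters are hypothetical. When the kept count stops climbing, new reads are mostly adding coverage that has already been seen.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def collectors_curve(reads, k=4, cutoff=3, step=2):
    """Return (reads_seen, reads_kept) points, one every `step` reads.

    A read is kept when the median count of its k-mers, among the
    reads kept so far, is below `cutoff` (toy digital normalization).
    """
    counts = Counter()
    kept = 0
    curve = []
    for n, read in enumerate(reads, start=1):
        kms = kmers(read, k)
        if kms and median(counts[km] for km in kms) < cutoff:
            kept += 1
            counts.update(kms)
        if n % step == 0:
            curve.append((n, kept))
    return curve


# Hypothetical 19 bp sequence sampled ten times, error-free.
reads = ["ACGTTGCAAGGCTATCCGA"] * 10
for seen, kept in collectors_curve(reads):
    print(seen, kept)
```

On this toy input the kept count rises for the first few reads and then stays flat, which is the signature of saturation; on real data the curve flattens gradually rather than abruptly.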