Thoughts on goals and tradeoffs in classifying shotgun metagenome data.
This is our just-submitted proposal for the JGI-NERSC "Facilities Integrating Collaborations for User Science" call. Enjoy!
Abstract: Sourmash is a command-line tool and Python library that calculates and compares MinHash signatures from sequence data. Sourmash "compare" and "gather" functionality enables comparison and characterization of signatures …
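For readers new to the idea, here is a minimal sketch of what "calculating and comparing MinHash signatures" can look like. It is not sourmash's implementation or API: the function names, the use of SHA-1 as a stand-in hash, and the default parameters are all assumptions made for illustration, and it ignores reverse complements entirely.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer in a DNA string."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def minhash_sketch(seq, k=21, num_hashes=500):
    """Bottom-k sketch: the num_hashes smallest distinct k-mer hash values.

    Hypothetical helper; SHA-1 is only a stand-in hash function here.
    """
    hashes = set()
    for kmer in kmers(seq.upper(), k):
        digest = hashlib.sha1(kmer.encode()).digest()
        # Keep the first 8 bytes as a 64-bit integer hash value.
        hashes.add(int.from_bytes(digest[:8], "big"))
    return sorted(hashes)[:num_hashes]


def jaccard_estimate(sketch_a, sketch_b, num_hashes=500):
    """Estimate Jaccard similarity from two bottom-k sketches.

    Take the num_hashes smallest values of the combined sketches and
    count how many of them appear in both inputs.
    """
    union = sorted(set(sketch_a) | set(sketch_b))[:num_hashes]
    if not union:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in union if h in shared) / len(union)
```

Comparing two sequences is then a matter of sketching each one with the same `k` and `num_hashes` and passing both sketches to `jaccard_estimate`; the sketches stay small no matter how long the input sequences are.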
This is another blog post on MinHash sketches; see also:
One of the uses of MinHash sketches that interests us most is indexing and searching large public, semi-public, and private databases. There are many specific use cases for this, but the basic goal is to be able to find data sets by content queries, using sequence …
This is an update to last week's blog post, "Efficiently searching MinHash Sketch collections".
Last week, Thanksgiving travel and post-turkey somnolescence gave me some time to work more with our combined MinHash/SBT implementation. One of the main things the last post contained was a collection of MinHash signatures of …
There is an update to this blog post: please see "Quickly searching all the microbial genomes, mark 2 - now with archaea, phage, fungi, and protists!"
Note: This blog post is based largely on work done by Luiz Irber. Camille Scott, Luiz Irber, Lisa Cohen, and Russell Neches all collaborated on …