One of our long-term interests has been in figuring out what the
!$!$!#!#%! assemblers actually do to real data, given all their
heuristics. A continuing challenge in this space is that short-read
assemblers deal with really large amounts of noisy data, and it can
be extremely hard to look at assembly …
So far, in this week of khmer blog posts (1, 2, 3), we've been
focusing on the read-to-graph aligner ("graphalign"), which enables
sequence alignments to a De Bruijn graph. One persistent challenge
with this functionality as introduced is that our De Bruijn graph
nodes are anonymous, so we have no …
De Bruijn graph alignment should
also be useful for exploring concepts in transcriptomics/mRNAseq
expression. As with variant calling,
graphalign can also be used to avoid the mapping step in
quantification; and, again, as with the variant calling approach, we
can do so by aligning our reference sequences to the …
There's an interesting and intuitive connection between error
correction and
variant calling - if you can do one well, it lets you do (parts of)
the other well. In the previous blog post on
some new features in khmer, we introduced our new "graphalign"
functionality, which lets us align short sequences …
One of the newer features in khmer that we're pretty excited about is
the read-to-graph aligner, which gives us a way to align sequences to
a De Bruijn graph; our nickname for it is "graphalign."
Briefly, graphalign uses a pair-HMM to align a sequence to a k-mer
graph (aka De …