Summary: | <p>Abstract</p> <p>Background</p> <p>Shotgun sequencing of environmental DNA is an essential technique for characterizing uncultivated microbes <it>in situ</it>. However, the taxonomic and functional assignment of the resulting sequence fragments remains a pressing problem.</p> <p>Results</p> <p>Existing algorithms are largely optimized for speed and coverage. In contrast, we present a software framework that focuses on a restricted set of informative gene families and uses maximum likelihood to assign fragments from these families as accurately as possible. This framework ('MLTreeMap'; <url>http://mltreemap.org/</url>) takes raw nucleotide sequences as input and includes hand-curated, extensible reference information.</p> <p>Conclusions</p> <p>We discuss how we validated our pipeline using complete genomes as well as simulated and actual environmental sequences.</p>
|