Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT

Bibliographic Details
Main Authors: F. A. Bastiaan von Meijenfeldt, Ksenia Arkhipova, Diego D. Cambuy, Felipe H. Coutinho, Bas E. Dutilh
Format: Article
Language: English
Published: BMC 2019-10-01
Series: Genome Biology
Online Access: http://link.springer.com/article/10.1186/s13059-019-1817-x
collection DOAJ
description Abstract Current-day metagenomics analyses increasingly involve de novo taxonomic classification of long DNA sequences and metagenome-assembled genomes. Here, we show that the conventional best-hit approach often leads to classifications that are too specific, especially when the sequences represent novel deep lineages. We present a classification method that integrates multiple signals to classify sequences (Contig Annotation Tool, CAT) and metagenome-assembled genomes (Bin Annotation Tool, BAT). Classifications are automatically made at low taxonomic ranks if closely related organisms are present in the reference database and at higher ranks otherwise. The result is a high classification precision even for sequences from considerably unknown organisms.
first_indexed 2024-12-20T17:20:15Z
id doaj.art-e2f82306a2f447aca0d7da181b561e44
institution Directory of Open Access Journals
issn 1474-760X
last_indexed 2024-12-20T17:20:15Z
spelling doaj.art-e2f82306a2f447aca0d7da181b561e44; 2022-12-21T19:31:51Z; eng; BMC; Genome Biology; 1474-760X; 2019-10-01; 20; 1; 1; 14; 10.1186/s13059-019-1817-x
F. A. Bastiaan von Meijenfeldt (Theoretical Biology and Bioinformatics, Science for Life, Utrecht University)
Ksenia Arkhipova (Theoretical Biology and Bioinformatics, Science for Life, Utrecht University)
Diego D. Cambuy (Theoretical Biology and Bioinformatics, Science for Life, Utrecht University)
Felipe H. Coutinho (Centre for Molecular and Biomolecular Informatics, Radboud University Medical Centre)
Bas E. Dutilh (Theoretical Biology and Bioinformatics, Science for Life, Utrecht University)