Identification and quantitation of clinically relevant microbes in patient samples: Comparison of three k-mer based classifiers for speed, accuracy, and sensitivity.

Infections are a serious health concern worldwide, particularly in vulnerable populations such as the immunocompromised, elderly, and young. Advances in metagenomic sequencing availability, speed, and decreased cost offer the opportunity to supplement or even replace culture-based identification of...

Bibliographic Details
Main Authors: George S Watts, James E Thornton, Ken Youens-Clark, Alise J Ponsero, Marvin J Slepian, Emmanuel Menashi, Charles Hu, Wuquan Deng, David G Armstrong, Spenser Reed, Lee D Cranmer, Bonnie L Hurwitz
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2019-11-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1006863
collection DOAJ
description Infections are a serious health concern worldwide, particularly in vulnerable populations such as the immunocompromised, elderly, and young. Advances in the availability and speed of metagenomic sequencing, together with its decreasing cost, offer the opportunity to supplement or even replace culture-based identification of pathogens with DNA sequence-based diagnostics. Adopting metagenomic analysis for clinical use requires that all aspects of the workflow be optimized and tested, including data analysis and computational time and resources. We tested the accuracy, sensitivity, and resource requirements of three top metagenomic taxonomic classifiers that use fast k-mer based algorithms: Centrifuge, CLARK, and KrakenUniq. In binary mixtures of bacteria, all three reliably identified organisms down to 1% relative abundance, but only Centrifuge and CLARK gave accurate relative abundance estimates. All three classifiers identified the organisms present in their default databases from a mock bacterial community of 20 organisms, but only Centrifuge had no false positives. In addition, Centrifuge required far fewer computational resources and less time for analysis. In metagenomes obtained from samples of ventilator-associated pneumonia (VAP), infected diabetic foot ulcers (DFUs), and febrile neutropenia (FN), Centrifuge identified pathogenic bacteria and one virus that were corroborated by culture or a clinical PCR assay. Importantly, in both diabetic foot ulcer patients, metagenomic sequencing identified pathogens 4-6 weeks before culture. Finally, we show that Centrifuge results were minimally affected by eliminating the time-consuming read quality control and host screening steps.
id doaj.art-a45dc5c6d5854b179b7341d782e0bba9
institution Directory Open Access Journal
issn 1553-734X
1553-7358