Background Filtering of Clinical Metagenomic Sequencing with a Library Concentration-Normalized Model

Bibliographic Details
Main Authors: Juan Du, Jingjia Zhang, Dong Zhang, Yiwen Zhou, Pengfei Wu, Wenchao Ding, Jun Wang, Chuan Ouyang, Qiwen Yang
Format: Article
Language: English
Published: American Society for Microbiology 2022-10-01
Series: Microbiology Spectrum
Subjects: metagenomic sequencing; background filtering; premodeling; linear regression; clinical settings
Online Access: https://journals.asm.org/doi/10.1128/spectrum.01779-22
author Juan Du
Jingjia Zhang
Dong Zhang
Yiwen Zhou
Pengfei Wu
Wenchao Ding
Jun Wang
Chuan Ouyang
Qiwen Yang
collection DOAJ
description ABSTRACT: Metagenomic next-generation sequencing (mNGS) can accurately detect pathogens in clinical samples. However, wet-lab contamination constrains mNGS analysis and may result in erroneous interpretation of results. Many existing methods rely on large-scale observational microbiome studies and may not be applicable to clinical mNGS tests. By generation of a pretrained profile of common laboratory contaminants, we developed an mNGS noise-filtering model based on the inverse linear relationship between microbial sequencing reads and sample library concentration, named the background elimination and correction by library concentration-normalized (BECLEAN) model. Its efficacy was evaluated with bacteria- and yeast-spiked samples and 28 cerebrospinal fluid (CSF) specimens. The diagnostic accuracy, precision, sensitivity, and specificity of BECLEAN with reference to conventional methods and diagnosis were 92.9%, 86.7%, 100%, and 86.7%, respectively. BECLEAN led to a dramatic reduction of background noise without affecting the true-positive rate and thus can provide a time-saving and convenient tool in various clinical settings.
IMPORTANCE: Most of the existing methods to remove wet-lab contamination rely on large-scale observational microbiome studies and may not be applicable to clinical mNGS testing in individual cases. In clinical settings, only a handful of samples might be sequenced in a run. The lab-specific microbiome can complicate existing statistical approaches for removing contamination from small-scale clinical metagenomic sequencing data sets; thus, use of a preliminary lab-specific training set is necessary. Our study provides a rapid and accurate background-filtering tool for clinical metagenomic sequencing by generation of a pretrained profile of common laboratory contaminants. Notably, our work demonstrates that the inverse linear relationship between microbial sequencing reads and library concentration can serve to identify true contaminants and evaluate the relative abundance of a taxon in samples by comparing the observed microbial reads to the model-predicted value. Our findings extend the previously published research and demonstrate confirmatory results in clinical settings.
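Annotation: the description's two quantitative ideas can be illustrated with a minimal Python sketch. It is not the BECLEAN implementation. The abstract does not state which transformation the model uses, so the log10-log10 fit here is an assumption, as are the function names and the made-up training values. The confusion-matrix counts (13 true positives, 2 false positives, 13 true negatives, 0 false negatives) do not appear in the record; they are inferred as the counts over 28 specimens that reproduce the reported 92.9%, 86.7%, 100%, and 86.7%.

```python
from math import log10


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def observed_to_predicted(reads, conc, slope, intercept):
    """Ratio of observed reads to the background predicted at this library concentration."""
    predicted = 10 ** (slope * log10(conc) + intercept)
    return reads / predicted


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical training libraries for one contaminant taxon:
# (library concentration, contaminant reads). Toy values, not study data,
# chosen so that reads fall as library concentration rises.
TRAINING = [(0.5, 400), (1.0, 200), (2.0, 100), (4.0, 50)]
slope, intercept = fit_line(
    [log10(c) for c, _ in TRAINING],
    [log10(r) for _, r in TRAINING],
)

# Counts inferred from the reported percentages (an assumption, see above).
metrics = diagnostic_metrics(tp=13, fp=2, tn=13, fn=0)

if __name__ == "__main__":
    print(f"fitted slope: {slope:.2f}")
    print(f"observed/predicted: {observed_to_predicted(1000, 2.0, slope, intercept):.1f}")
    print({name: round(100 * value, 1) for name, value in metrics.items()})
```

A taxon whose observed reads sit far above the level predicted for its library concentration would be a candidate true signal, while one near the predicted level would be treated as background; the cutoff separating the two is not given in this record and is left open here.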
first_indexed 2024-04-12T01:26:34Z
format Article
id doaj.art-b9af8a7946904d2b92141b6b868db7e2
institution Directory of Open Access Journals
issn 2165-0497
language English
last_indexed 2024-04-12T01:26:34Z
publishDate 2022-10-01
publisher American Society for Microbiology
record_format Article
series Microbiology Spectrum
spelling doaj.art-b9af8a7946904d2b92141b6b868db7e2
2022-12-22T03:53:38Z
eng
American Society for Microbiology
Microbiology Spectrum
2165-0497
2022-10-01
10
5
10.1128/spectrum.01779-22
Background Filtering of Clinical Metagenomic Sequencing with a Library Concentration-Normalized Model
Juan Du [0], Jingjia Zhang [1], Dong Zhang [2], Yiwen Zhou [3], Pengfei Wu [4], Wenchao Ding [5], Jun Wang [6], Chuan Ouyang [7], Qiwen Yang [8]
[0-2, 8] Department of Clinical Laboratory, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, China
[3-7] Hangzhou Matridx Biotechnology Co., Ltd., Hangzhou, Zhejiang, China
https://journals.asm.org/doi/10.1128/spectrum.01779-22
metagenomic sequencing
background filtering
premodeling
linear regression
clinical settings
title Background Filtering of Clinical Metagenomic Sequencing with a Library Concentration-Normalized Model
topic metagenomic sequencing
background filtering
premodeling
linear regression
clinical settings
url https://journals.asm.org/doi/10.1128/spectrum.01779-22