BLAST-based validation of metagenomic sequence assignments

Bibliographic Details
Main Authors: Adam L. Bazinet, Brian D. Ondov, Daniel D. Sommer, Shashikala Ratnayake
Format: Article
Language: English
Published: PeerJ Inc. 2018-05-01
Series: PeerJ
Online Access: https://peerj.com/articles/4892.pdf
Collection: DOAJ
Description: When performing bioforensic casework, it is important to be able to reliably detect the presence of a particular organism in a metagenomic sample, even if the organism is only present in a trace amount. For this task, it is common to use a sequence classification program that determines the taxonomic affiliation of individual sequence reads by comparing them to reference database sequences. As metagenomic data sets often consist of millions or billions of reads that need to be compared to reference databases containing millions of sequences, such sequence classification programs typically use search heuristics and databases with reduced sequence diversity to speed up the analysis, which can lead to incorrect assignments. Thus, in a bioforensic setting where correct assignments are paramount, assignments of interest made by “first-pass” classifiers should be confirmed using the most precise methods and comprehensive databases available. In this study we present a BLAST-based method for validating the assignments made by less precise sequence classification programs, with optimal parameters for filtering of BLAST results determined via simulation of sequence reads from genomes of interest, and we apply the method to the detection of four pathogenic organisms. The software implementing the method is open source and freely available.
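As a rough illustration of the kind of filtering the description refers to, the sketch below re-checks reads against BLAST tabular output (`-outfmt 6`, standard 12 columns). It is not the authors' software. The function names, the `subject_taxon` lookup table, and the `max_evalue` and `min_margin` thresholds are all hypothetical; the article derives its own filtering parameters by simulating reads from the genomes of interest.

```python
from collections import defaultdict


def parse_outfmt6(lines):
    """Yield (query, subject, evalue, bitscore) from BLAST -outfmt 6 lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Standard columns: 0 = qseqid, 1 = sseqid, 10 = evalue, 11 = bitscore
        yield fields[0], fields[1], float(fields[10]), float(fields[11])


def confirm_reads(lines, subject_taxon, target, max_evalue=1e-10, min_margin=0.0):
    """Return {read_id: bool} saying whether each read is confirmed as `target`.

    Assumed rule (hypothetical): a read is confirmed when its best
    target-taxon hit passes the E-value cutoff and its bit score exceeds
    the best non-target bit score by more than `min_margin`.
    """
    best_target = defaultdict(float)
    best_other = defaultdict(float)
    queries = set()
    for query, subject, evalue, bits in parse_outfmt6(lines):
        queries.add(query)
        if evalue > max_evalue:
            continue  # hit too weak to count for or against the read
        bucket = best_target if subject_taxon.get(subject) == target else best_other
        bucket[query] = max(bucket[query], bits)
    return {
        q: q in best_target and best_target[q] - best_other.get(q, 0.0) > min_margin
        for q in queries
    }
```

In practice `lines` would be an open file handle over the BLAST output, and `subject_taxon` would be built from whatever taxonomy mapping accompanies the reference database. Raising `min_margin` trades sensitivity for fewer false confirmations.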
ISSN: 2167-8359
DOI: 10.7717/peerj.4892
Citation: PeerJ 6:e4892
Author Affiliation: National Biodefense Analysis and Countermeasures Center, Fort Detrick, MD, USA (all four authors)
Subjects: BLAST; Metagenomics; Sequence classification; Taxonomic assignment; Bioforensics; Validation