Studying pathogens degrades BLAST-based pathogen identification
Abstract: As synthetic biology becomes increasingly capable and accessible, it is likewise increasingly critical to be able to make accurate biosecurity determinations regarding the pathogenicity or toxicity of particular nucleic acid or amino acid sequences. At present, this is typically done using...
Main Authors: | Jacob Beal, Adam Clore, Jeff Manthey |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2023-04-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-023-32481-z |
_version_ | 1827970522000916480 |
---|---|
author | Jacob Beal, Adam Clore, Jeff Manthey |
author_sort | Jacob Beal |
collection | DOAJ |
description | As synthetic biology becomes increasingly capable and accessible, it is likewise increasingly critical to be able to make accurate biosecurity determinations regarding the pathogenicity or toxicity of particular nucleic acid or amino acid sequences. At present, this is typically done using the BLAST algorithm to determine the best match with sequences in the NCBI nucleic acid and protein databases. Neither BLAST nor any of the NCBI databases, however, are actually designed for biosafety determination. Critically, taxonomic errors or ambiguities in the NCBI nucleic acid and protein databases can also cause errors in BLAST-based taxonomic categorization. With heavily studied taxa and frequently used biotechnology tools, even low frequency taxonomic categorization issues can lead to high rates of errors in biosecurity decision-making. Here we focus on the implications for false positives, finding that BLAST against NCBI’s protein database will now incorrectly categorize a number of commonly used biotechnology tool sequences as the pathogens or toxins with which they have been used. Paradoxically, this implies that problems are expected to be most acute for the pathogens and toxins of highest interest and for the most widely used biotechnology tools. We thus conclude that biosecurity tools should shift away from BLAST against general purpose databases and towards new methods that are specifically tailored for biosafety purposes. |
first_indexed | 2024-04-09T18:55:09Z |
format | Article |
id | doaj.art-a6a45a2c80124d63b21966c7389b2522 |
institution | Directory of Open Access Journals |
issn | 2045-2322 |
language | English |
last_indexed | 2024-04-09T18:55:09Z |
publishDate | 2023-04-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-a6a45a2c80124d63b21966c7389b2522; 2023-04-09T11:15:11Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2023-04-01; 13 1 1 16; 10.1038/s41598-023-32481-z; Studying pathogens degrades BLAST-based pathogen identification; Jacob Beal (Raytheon BBN), Adam Clore (Integrated DNA Technologies), Jeff Manthey (Integrated DNA Technologies); https://doi.org/10.1038/s41598-023-32481-z |
title | Studying pathogens degrades BLAST-based pathogen identification |
url | https://doi.org/10.1038/s41598-023-32481-z |