A Large-Scale Study into Protist-Animal Interactions Based on Public Genomic Data Using DNA Barcodes

With the birth of next-generation sequencing (NGS) technology, genomic data in public databases have increased exponentially. Unfortunately, exogenous contamination or intracellular parasite sequences in assemblies could confuse genomic analysis. Meanwhile, they can provide a valuable resource for s...


Bibliographic Details
Main Authors: Jiazheng Xie, Bowen Tan, Yi Zhang
Format: Article
Language: English
Published: MDPI AG 2023-07-01
Series: Animals
Subjects: protist, DNA barcode, contamination, symbiosis, parasites, host-microbe interactions
Online Access: https://www.mdpi.com/2076-2615/13/14/2243
author Jiazheng Xie
Bowen Tan
Yi Zhang
collection DOAJ
description With the advent of next-generation sequencing (NGS) technology, genomic data in public databases have increased exponentially. Unfortunately, exogenous contamination or intracellular parasite sequences in assemblies can confound genomic analysis. At the same time, such sequences are a valuable resource for studies of host-microbe interactions. Here, we used a strategy based on DNA barcodes to scan for protistan contamination in the GenBank WGS/TSA database. Of the 13,952 metazoan (animal) assemblies in GenBank, 1507 (10.8%) contained protistan contaminants, totaling 17,036 contigs, with even higher contamination rates in Cnidaria (150/281), Crustacea (237/480), and Mollusca (107/410). Taxonomic analysis of the protists derived from these contigs showed variation in the abundance and evenness of protistan contamination across metazoan taxa, reflecting host preferences of Apicomplexa, Ciliophora, Oomycota, and Symbiodiniaceae for mammals and birds, Crustacea, insects, and Cnidaria, respectively. Finally, the mitochondrial proteins COX1 and CYTB were predicted from these contigs, and phylogenetic analysis corroborated the protistan origin and heterogeneous distribution of the contaminating contigs. Overall, we conducted a large-scale scan for protistan contaminants in genomic resources, and the protistan sequences detected will help uncover protist diversity and the relationships of these picoeukaryotes with Metazoa.
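The ratios quoted in the description can be turned into percentages with a few lines of arithmetic. The sketch below is illustrative only: the counts are copied from the abstract, and it assumes each pair is (contaminated assemblies, total assemblies) for that taxon. The names `counts`, `contamination_rate`, and `rates` are hypothetical and not part of the study's pipeline.

```python
# Counts copied from the abstract; assumed to mean
# (contaminated assemblies, total assemblies) for each group.
counts = {
    "All Metazoa": (1507, 13952),
    "Cnidaria": (150, 281),
    "Crustacea": (237, 480),
    "Mollusca": (107, 410),
}


def contamination_rate(contaminated: int, total: int) -> float:
    """Return the share of contaminated assemblies as a percentage."""
    return 100.0 * contaminated / total


# Round to one decimal place, matching the precision used in the abstract.
rates = {
    taxon: round(contamination_rate(contaminated, total), 1)
    for taxon, (contaminated, total) in counts.items()
}

for taxon, rate in rates.items():
    print(f"{taxon}: {rate}%")
```

Under that reading, the overall figure reproduces the 10.8% stated in the abstract, and roughly half of the Cnidaria and Crustacea assemblies carry protistan contigs.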
first_indexed 2024-03-11T01:22:15Z
format Article
id doaj.art-20597547362d4dd298ed311615b7e76d
institution Directory Open Access Journal
issn 2076-2615
language English
last_indexed 2024-03-11T01:22:15Z
publishDate 2023-07-01
publisher MDPI AG
record_format Article
series Animals
spelling doaj.art-20597547362d4dd298ed311615b7e76d
2023-11-18T17:59:30Z
eng
MDPI AG
Animals
2076-2615
2023-07-01
13(14), 2243
10.3390/ani13142243
A Large-Scale Study into Protist-Animal Interactions Based on Public Genomic Data Using DNA Barcodes
Jiazheng Xie, Bowen Tan, Yi Zhang (all: Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
https://www.mdpi.com/2076-2615/13/14/2243
protist; DNA barcode; contamination; symbiosis; parasites; host-microbe interactions
title A Large-Scale Study into Protist-Animal Interactions Based on Public Genomic Data Using DNA Barcodes
topic protist
DNA barcode
contamination
symbiosis
parasites
host-microbe interactions
url https://www.mdpi.com/2076-2615/13/14/2243