Eukaryotic genomes from a global metagenomic data set illuminate trophic modes and biogeography of ocean plankton
ABSTRACT Metagenomics is a powerful method for interpreting the ecological roles and physiological capabilities of mixed microbial communities. Yet, many tools for processing metagenomic data are neither designed to consider eukaryotes nor are they built for an increasing amount of sequence data. Euk...
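Main Authors: | Harriet Alexander, Sarah K. Hu, Arianna I. Krinos, Maria Pachiadaki, Benjamin J. Tully, Christopher J. Neely, Taylor Reiter |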
---|---|
Format: | Article |
Language: | English |
Published: | American Society for Microbiology 2023-12-01 |
Series: | mBio |
Subjects: | metagenomics, protists, genomes, eukaryotic metagenome-assembled genomes |
Online Access: | https://journals.asm.org/doi/10.1128/mbio.01676-23 |
_version_ | 1827573143432069120 |
---|---|
author | Harriet Alexander, Sarah K. Hu, Arianna I. Krinos, Maria Pachiadaki, Benjamin J. Tully, Christopher J. Neely, Taylor Reiter |
author_sort | Harriet Alexander |
collection | DOAJ |
description | ABSTRACT Metagenomics is a powerful method for interpreting the ecological roles and physiological capabilities of mixed microbial communities. Yet, many tools for processing metagenomic data are neither designed to consider eukaryotes nor are they built for an increasing amount of sequence data. EukHeist is an automated pipeline to retrieve eukaryotic and prokaryotic metagenome-assembled genomes (MAGs) from large-scale metagenomic sequence data sets. We developed the EukHeist workflow to specifically process large amounts of both metagenomic and/or metatranscriptomic sequence data in an automated and reproducible fashion. Here, we applied EukHeist to the large-size fraction data (0.8–2,000 µm) from Tara Oceans to recover both eukaryotic and prokaryotic MAGs, which we refer to as TOPAZ (Tara Oceans Particle-Associated MAGs). The TOPAZ MAGs consisted of >900 environmentally relevant eukaryotic MAGs and >4,000 bacterial and archaeal MAGs. The bacterial and archaeal TOPAZ MAGs expand upon the phylogenetic diversity of likely particle- and host-associated taxa. We use these MAGs to demonstrate an approach to infer the putative trophic mode of the recovered eukaryotic MAGs. We also identify ecological cohorts of co-occurring MAGs, which are driven by specific environmental factors and putative host-microbe associations. These data together add to a number of growing resources of environmentally relevant eukaryotic genomic information. Complementary and expanded databases of MAGs, such as those provided through scalable pipelines like EukHeist, stand to advance our understanding of eukaryotic diversity through increased coverage of genomic representatives across the tree of life. IMPORTANCE Single-celled eukaryotes play ecologically significant roles in the marine environment, yet fundamental questions about their biodiversity, ecological function, and interactions remain. Environmental sequencing enables researchers to document naturally occurring protistan communities, without culturing bias, yet metagenomic and metatranscriptomic sequencing approaches cannot separate individual species from communities. To more completely capture the genomic content of mixed protistan populations, we can create bins of sequences that represent the same organism (metagenome-assembled genomes [MAGs]). We developed the EukHeist pipeline, which automates the binning of population-level eukaryotic and prokaryotic genomes from metagenomic reads. We show exciting insight into what protistan communities are present and their trophic roles in the ocean. Scalable computational tools, like EukHeist, may accelerate the identification of meaningful genetic signatures from large data sets and complement researchers’ efforts to leverage MAG databases for addressing ecological questions, resolving evolutionary relationships, and discovering potentially novel biodiversity. |
first_indexed | 2024-03-08T20:14:02Z |
format | Article |
id | doaj.art-6da3e443e6f0412da21c4f11709f54a0 |
institution | Directory of Open Access Journals |
issn | 2150-7511 |
language | English |
last_indexed | 2024-03-08T20:14:02Z |
publishDate | 2023-12-01 |
publisher | American Society for Microbiology |
record_format | Article |
series | mBio |
spelling | doaj.art-6da3e443e6f0412da21c4f11709f54a0; 2023-12-22T19:53:43Z; eng; American Society for Microbiology; mBio; 2150-7511; 2023-12-01; 14(6); 10.1128/mbio.01676-23; Eukaryotic genomes from a global metagenomic data set illuminate trophic modes and biogeography of ocean plankton; Harriet Alexander (Biology Department, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA); Sarah K. Hu (Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA); Arianna I. Krinos (Biology Department, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA); Maria Pachiadaki (Biology Department, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA); Benjamin J. Tully (Department of Biological Sciences, University of Southern California, Los Angeles, California, USA); Christopher J. Neely (Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, USA); Taylor Reiter (Population Health and Reproduction, University of California, Davis, Davis, California, USA); https://journals.asm.org/doi/10.1128/mbio.01676-23; metagenomics; protists; genomes; eukaryotic metagenome-assembled genomes |
title | Eukaryotic genomes from a global metagenomic data set illuminate trophic modes and biogeography of ocean plankton |
topic | metagenomics, protists, genomes, eukaryotic metagenome-assembled genomes |
url | https://journals.asm.org/doi/10.1128/mbio.01676-23 |