Eukaryotic genomes from a global metagenomic data set illuminate trophic modes and biogeography of ocean plankton

ABSTRACT Metagenomics is a powerful method for interpreting the ecological roles and physiological capabilities of mixed microbial communities. Yet, many tools for processing metagenomic data are neither designed to consider eukaryotes nor are they built for an increasing amount of sequence data. Euk...

Bibliographic Details
Main Authors: Harriet Alexander, Sarah K. Hu, Arianna I. Krinos, Maria Pachiadaki, Benjamin J. Tully, Christopher J. Neely, Taylor Reiter
Format: Article
Language: English
Published: American Society for Microbiology 2023-12-01
Series: mBio
Subjects: metagenomics, protists, genomes, eukaryotic metagenome-assembled genomes
Online Access: https://journals.asm.org/doi/10.1128/mbio.01676-23
author Harriet Alexander
Sarah K. Hu
Arianna I. Krinos
Maria Pachiadaki
Benjamin J. Tully
Christopher J. Neely
Taylor Reiter
collection DOAJ
description ABSTRACT Metagenomics is a powerful method for interpreting the ecological roles and physiological capabilities of mixed microbial communities. Yet, many tools for processing metagenomic data are neither designed to consider eukaryotes nor are they built for an increasing amount of sequence data. EukHeist is an automated pipeline to retrieve eukaryotic and prokaryotic metagenome-assembled genomes (MAGs) from large-scale metagenomic sequence data sets. We developed the EukHeist workflow to specifically process large amounts of both metagenomic and/or metatranscriptomic sequence data in an automated and reproducible fashion. Here, we applied EukHeist to the large-size fraction data (0.8–2,000 µm) from Tara Oceans to recover both eukaryotic and prokaryotic MAGs, which we refer to as TOPAZ (Tara Oceans Particle-Associated MAGs). The TOPAZ MAGs consisted of >900 environmentally relevant eukaryotic MAGs and >4,000 bacterial and archaeal MAGs. The bacterial and archaeal TOPAZ MAGs expand upon the phylogenetic diversity of likely particle- and host-associated taxa. We use these MAGs to demonstrate an approach to infer the putative trophic mode of the recovered eukaryotic MAGs. We also identify ecological cohorts of co-occurring MAGs, which are driven by specific environmental factors and putative host-microbe associations. These data together add to a number of growing resources of environmentally relevant eukaryotic genomic information. Complementary and expanded databases of MAGs, such as those provided through scalable pipelines like EukHeist, stand to advance our understanding of eukaryotic diversity through increased coverage of genomic representatives across the tree of life.
IMPORTANCE Single-celled eukaryotes play ecologically significant roles in the marine environment, yet fundamental questions about their biodiversity, ecological function, and interactions remain. Environmental sequencing enables researchers to document naturally occurring protistan communities, without culturing bias, yet metagenomic and metatranscriptomic sequencing approaches cannot separate individual species from communities. To more completely capture the genomic content of mixed protistan populations, we can create bins of sequences that represent the same organism (metagenome-assembled genomes [MAGs]). We developed the EukHeist pipeline, which automates the binning of population-level eukaryotic and prokaryotic genomes from metagenomic reads. We show exciting insight into what protistan communities are present and their trophic roles in the ocean. Scalable computational tools, like EukHeist, may accelerate the identification of meaningful genetic signatures from large data sets and complement researchers’ efforts to leverage MAG databases for addressing ecological questions, resolving evolutionary relationships, and discovering potentially novel biodiversity.
first_indexed 2024-03-08T20:14:02Z
format Article
id doaj.art-6da3e443e6f0412da21c4f11709f54a0
institution Directory of Open Access Journals
issn 2150-7511
language English
last_indexed 2024-03-08T20:14:02Z
publishDate 2023-12-01
publisher American Society for Microbiology
record_format Article
series mBio
spelling doaj.art-6da3e443e6f0412da21c4f11709f54a0
2023-12-22T19:53:43Z
eng
American Society for Microbiology
mBio
2150-7511
2023-12-01
volume 14, issue 6
10.1128/mbio.01676-23
Eukaryotic genomes from a global metagenomic data set illuminate trophic modes and biogeography of ocean plankton
Harriet Alexander (Biology Department, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA)
Sarah K. Hu (Marine Chemistry and Geochemistry, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA)
Arianna I. Krinos (Biology Department, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA)
Maria Pachiadaki (Biology Department, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA)
Benjamin J. Tully (Department of Biological Sciences, University of Southern California, Los Angeles, California, USA)
Christopher J. Neely (Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, USA)
Taylor Reiter (Population Health and Reproduction, University of California, Davis, Davis, California, USA)
https://journals.asm.org/doi/10.1128/mbio.01676-23
metagenomics
protists
genomes
eukaryotic metagenome-assembled genomes
title Eukaryotic genomes from a global metagenomic data set illuminate trophic modes and biogeography of ocean plankton
topic metagenomics
protists
genomes
eukaryotic metagenome-assembled genomes
url https://journals.asm.org/doi/10.1128/mbio.01676-23