Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes

Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature.

Bibliographic Details
Main Authors: Ramy Karam Aziz, Bhakti Dwivedi, Sajia Akhter, Mya Breitbart, Robert A Edwards
Format: Article
Language: English
Published: Frontiers Media S.A. 2015-05-01
Series: Frontiers in Microbiology
Subjects: Metagenomics, virus, diversity, statistics, Bacteriophage, Phage
Online Access: http://journal.frontiersin.org/Journal/10.3389/fmicb.2015.00381/full
author Ramy Karam Aziz
Bhakti Dwivedi
Sajia Akhter
Mya Breitbart
Robert A Edwards
author_sort Ramy Karam Aziz
collection DOAJ
description Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution.
first_indexed 2024-04-12T11:29:34Z
format Article
id doaj.art-286da8fbd03e4d2b8bfb53b90dc38164
institution Directory of Open Access Journals
issn 1664-302X
language English
last_indexed 2024-04-12T11:29:34Z
publishDate 2015-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Microbiology
spelling doaj.art-286da8fbd03e4d2b8bfb53b90dc38164
2022-12-22T03:35:03Z
eng
Frontiers Media S.A.
Frontiers in Microbiology
1664-302X
2015-05-01
6
10.3389/fmicb.2015.00381
133863
Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes
Ramy Karam Aziz (Faculty of Pharmacy, Cairo University; San Diego State University; Argonne National Laboratory)
Bhakti Dwivedi (College of Marine Science, University of South Florida)
Sajia Akhter (San Diego State University)
Mya Breitbart (College of Marine Science, University of South Florida)
Robert A Edwards (San Diego State University; Argonne National Laboratory)
http://journal.frontiersin.org/Journal/10.3389/fmicb.2015.00381/full
Metagenomics
virus
diversity
statistics
Bacteriophage
Phage
title Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes
topic Metagenomics
virus
diversity
statistics
Bacteriophage
Phage
url http://journal.frontiersin.org/Journal/10.3389/fmicb.2015.00381/full