Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes
Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage e...
Main Authors: | Ramy Karam Aziz, Bhakti Dwivedi, Sajia Akhter, Mya Breitbart, Robert A Edwards |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2015-05-01 |
Series: | Frontiers in Microbiology |
Subjects: | Metagenomics, virus, diversity, statistics, Bacteriophage, Phage |
Online Access: | http://journal.frontiersin.org/Journal/10.3389/fmicb.2015.00381/full |
_version_ | 1811234025629548544 |
---|---|
author | Ramy Karam Aziz; Bhakti Dwivedi; Sajia Akhter; Mya Breitbart; Robert A Edwards |
author_facet | Ramy Karam Aziz; Bhakti Dwivedi; Sajia Akhter; Mya Breitbart; Robert A Edwards |
author_sort | Ramy Karam Aziz |
collection | DOAJ |
description | Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution. |
first_indexed | 2024-04-12T11:29:34Z |
format | Article |
id | doaj.art-286da8fbd03e4d2b8bfb53b90dc38164 |
institution | Directory of Open Access Journals |
issn | 1664-302X |
language | English |
last_indexed | 2024-04-12T11:29:34Z |
publishDate | 2015-05-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Microbiology |
spelling | doaj.art-286da8fbd03e4d2b8bfb53b90dc38164; 2022-12-22T03:35:03Z; eng; Frontiers Media S.A.; Frontiers in Microbiology; 1664-302X; 2015-05-01; 6; 10.3389/fmicb.2015.00381; 133863; Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes; Ramy Karam Aziz (Faculty of Pharmacy, Cairo University / San Diego State University / Argonne National Laboratory), Bhakti Dwivedi (College of Marine Science, University of South Florida), Sajia Akhter (San Diego State University), Mya Breitbart (College of Marine Science, University of South Florida), Robert A Edwards (San Diego State University / Argonne National Laboratory); Phages are the most abundant biological entities on Earth and play major ecological roles, yet the current sequenced phage genomes do not adequately represent their diversity, and little is known about the abundance and distribution of these sequenced genomes in nature. Although the study of phage ecology has benefited tremendously from the emergence of metagenomic sequencing, a systematic survey of phage genes and genomes in various ecosystems is still lacking, and fundamental questions about phage biology, lifestyle, and ecology remain unanswered. To address these questions and improve comparative analysis of phages in different metagenomes, we screened a core set of publicly available metagenomic samples for sequences related to completely sequenced phages using the web tool, Phage Eco-Locator. We then adopted and deployed an array of mathematical and statistical metrics for a multidimensional estimation of the abundance and distribution of phage genes and genomes in various ecosystems. Experiments using those metrics individually showed their usefulness in emphasizing the pervasive, yet uneven, distribution of known phage sequences in environmental metagenomes. Using these metrics in combination allowed us to resolve phage genomes into clusters that correlated with their genotypes and taxonomic classes as well as their ecological properties. We propose adding this set of metrics to current metaviromic analysis pipelines, where they can provide insight regarding phage mosaicism, habitat specificity, and evolution.; http://journal.frontiersin.org/Journal/10.3389/fmicb.2015.00381/full; Metagenomics, virus, diversity, statistics, Bacteriophage, Phage |
spellingShingle | Ramy Karam Aziz; Bhakti Dwivedi; Sajia Akhter; Mya Breitbart; Robert A Edwards; Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes; Frontiers in Microbiology; Metagenomics; virus; diversity; statistics; Bacteriophage; Phage |
title | Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes |
title_full | Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes |
title_fullStr | Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes |
title_full_unstemmed | Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes |
title_short | Multidimensional metrics for estimating phage abundance, distribution, gene density, and sequence coverage in metagenomes |
title_sort | multidimensional metrics for estimating phage abundance distribution gene density and sequence coverage in metagenomes |
topic | Metagenomics; virus; diversity; statistics; Bacteriophage; Phage |
url | http://journal.frontiersin.org/Journal/10.3389/fmicb.2015.00381/full |
work_keys_str_mv | AT ramykaramaziz multidimensionalmetricsforestimatingphageabundancedistributiongenedensityandsequencecoverageinmetagenomes AT ramykaramaziz multidimensionalmetricsforestimatingphageabundancedistributiongenedensityandsequencecoverageinmetagenomes AT ramykaramaziz multidimensionalmetricsforestimatingphageabundancedistributiongenedensityandsequencecoverageinmetagenomes AT bhaktiedwivedi multidimensionalmetricsforestimatingphageabundancedistributiongenedensityandsequencecoverageinmetagenomes AT sajiaeakhter multidimensionalmetricsforestimatingphageabundancedistributiongenedensityandsequencecoverageinmetagenomes AT myaebreitbart multidimensionalmetricsforestimatingphageabundancedistributiongenedensityandsequencecoverageinmetagenomes AT robertaedwards multidimensionalmetricsforestimatingphageabundancedistributiongenedensityandsequencecoverageinmetagenomes AT robertaedwards multidimensionalmetricsforestimatingphageabundancedistributiongenedensityandsequencecoverageinmetagenomes |