happi: a hierarchical approach to pangenomics inference

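Abstract: Recovering metagenome-assembled genomes (MAGs) from shotgun sequencing data is an increasingly common task in microbiome studies, as MAGs provide deeper insight into the functional potential of both culturable and non-culturable microorganisms. However, metagenome-assembled genomes vary in quality and may contain omissions and contamination. These errors present challenges for detecting genes and comparing gene enrichment across sample types. To address this, we propose happi, an approach to testing hypotheses about gene enrichment that accounts for genome quality. We illustrate the advantages of happi over existing approaches using published Saccharibacteria MAGs, Streptococcus thermophilus MAGs, and via simulation.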

Bibliographic Details
Main Authors: Pauline Trinh, David S. Clausen, Amy D. Willis
Format: Article
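Language: English
Published: BMC 2023-09-01
Series: Genome Biology
Subjects: Shotgun metagenomics; Metagenome-assembled genomes; Microbiome; Statistical models; Hypothesis testing
Online Access: https://doi.org/10.1186/s13059-023-03040-6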
_version_ 1797559404169527296
author Pauline Trinh
David S. Clausen
Amy D. Willis
collection DOAJ
description Abstract Recovering metagenome-assembled genomes (MAGs) from shotgun sequencing data is an increasingly common task in microbiome studies, as MAGs provide deeper insight into the functional potential of both culturable and non-culturable microorganisms. However, metagenome-assembled genomes vary in quality and may contain omissions and contamination. These errors present challenges for detecting genes and comparing gene enrichment across sample types. To address this, we propose happi, an approach to testing hypotheses about gene enrichment that accounts for genome quality. We illustrate the advantages of happi over existing approaches using published Saccharibacteria MAGs, Streptococcus thermophilus MAGs, and via simulation.
first_indexed 2024-03-10T17:44:53Z
format Article
id doaj.art-6f73b8bc342d44ea829ff6fba2400e07
institution Directory Open Access Journal
issn 1474-760X
language English
last_indexed 2024-03-10T17:44:53Z
publishDate 2023-09-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-6f73b8bc342d44ea829ff6fba2400e07
2023-11-20T09:35:12Z
eng
BMC
Genome Biology
1474-760X
2023-09-01
241115
10.1186/s13059-023-03040-6
happi: a hierarchical approach to pangenomics inference
Pauline Trinh (0)
David S. Clausen (1)
Amy D. Willis (2)
Department of Environmental & Occupational Health Sciences, University of Washington
Department of Biostatistics, University of Washington
Department of Biostatistics, University of Washington
Abstract Recovering metagenome-assembled genomes (MAGs) from shotgun sequencing data is an increasingly common task in microbiome studies, as MAGs provide deeper insight into the functional potential of both culturable and non-culturable microorganisms. However, metagenome-assembled genomes vary in quality and may contain omissions and contamination. These errors present challenges for detecting genes and comparing gene enrichment across sample types. To address this, we propose happi, an approach to testing hypotheses about gene enrichment that accounts for genome quality. We illustrate the advantages of happi over existing approaches using published Saccharibacteria MAGs, Streptococcus thermophilus MAGs, and via simulation.
https://doi.org/10.1186/s13059-023-03040-6
Shotgun metagenomics
Metagenome-assembled genomes
Microbiome
Statistical models
Hypothesis testing
title happi: a hierarchical approach to pangenomics inference
topic Shotgun metagenomics
Metagenome-assembled genomes
Microbiome
Statistical models
Hypothesis testing
url https://doi.org/10.1186/s13059-023-03040-6