Human reference gut microbiome catalog including newly assembled genomes from under-represented Asian metagenomes

Abstract Background Metagenome sampling bias for geographical location and lifestyle is partially responsible for the incomplete catalog of reference genomes of gut microbial species. Thus, genome assembly from currently under-represented populations may effectively expand the reference gut microbio...


Bibliographic Details
Main Authors: Chan Yeong Kim, Muyoung Lee, Sunmo Yang, Kyungnam Kim, Dongeun Yong, Hye Ryun Kim, Insuk Lee
Format: Article
Language: English
Published: BMC 2021-08-01
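Series: Genome Medicine
Subjects: Metagenomic shotgun sequencing, Human gut microbiome, Metagenome-assembled genome, Cross-reactive antigen
Online Access: https://doi.org/10.1186/s13073-021-00950-7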
_version_ 1819069313998389248
author Chan Yeong Kim
Muyoung Lee
Sunmo Yang
Kyungnam Kim
Dongeun Yong
Hye Ryun Kim
Insuk Lee
author_sort Chan Yeong Kim
collection DOAJ
description Abstract. Background: Metagenome sampling bias with respect to geographical location and lifestyle is partially responsible for the incomplete catalog of reference genomes of gut microbial species. Genome assembly from currently under-represented populations may therefore effectively expand the reference gut microbiome and improve taxonomic and functional profiling. Methods: We assembled genomes using public whole-metagenomic shotgun sequencing (WMS) data for 110 and 645 fecal samples from India and Japan, respectively. In addition, we assembled genomes from newly generated WMS data for 90 fecal samples collected from Korea. Because genome assembly for low-abundance species may require much deeper sequencing than is usually employed, we performed ultra-deep WMS (> 30 Gbp or > 100 million read pairs) on the fecal samples from Korea. We consequently assembled 29,082 prokaryotic genomes from 845 fecal metagenomes from the three under-represented Asian countries and combined them with the Unified Human Gastrointestinal Genome (UHGG) to generate an expanded catalog, the Human Reference Gut Microbiome (HRGM). Results: HRGM contains 232,098 non-redundant genomes for 5414 representative prokaryotic species, including 780 that are novel, > 103 million unique proteins, and > 274 million single-nucleotide variants. This represents an increase of over 10% relative to the UHGG. The 780 new species were enriched for the Bacteroidaceae family, including species associated with high-fiber and seaweed-rich diets. Single-nucleotide variant density was positively associated with the speciation rate of gut commensals. We found that ultra-deep sequencing facilitated the assembly of genomes for low-abundance taxa, and that deep sequencing (e.g., > 20 million read pairs) may be needed for the profiling of low-abundance taxa. Importantly, the HRGM significantly improved the taxonomic and functional classification of sequencing reads from fecal samples. Finally, analysis of human self-antigen homologs in the HRGM species genomes suggested that bacterial taxa with high cross-reactivity potential may contribute more to the pathogenesis of gut microbiome-associated diseases than those with low cross-reactivity potential by promoting inflammatory conditions. Conclusions: By including gut metagenomes from previously under-represented Asian countries (Korea, India, and Japan), we developed a substantially expanded microbiome catalog, HRGM. Information on the microbial genomes and coding genes is publicly available ( www.mbiomenet.org/HRGM/ ). HRGM will facilitate the identification and functional analysis of disease-associated gut microbiota.
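As a rough illustration of the figures quoted in the description above, the following Python sketch checks the sample total and the share of novel species, and labels a sample by the read-pair thresholds the abstract mentions. The constant names, the helper functions, and the "standard" label for samples below 20 million read pairs are hypothetical and do not come from the authors' pipeline.

```python
# Sample counts per country, as stated in the abstract.
SAMPLES = {"India": 110, "Japan": 645, "Korea": 90}

# Read-pair thresholds quoted in the abstract.
ULTRA_DEEP_READ_PAIRS = 100_000_000  # "> 100 million read pairs"
DEEP_READ_PAIRS = 20_000_000         # "> 20 million read pairs"

# Species counts quoted in the abstract.
TOTAL_SPECIES = 5414
NOVEL_SPECIES = 780


def total_samples(samples):
    """Sum the fecal metagenomes across all countries."""
    return sum(samples.values())


def depth_tier(read_pairs):
    """Hypothetical helper: label a sample by sequencing depth.

    The "standard" label is an assumption made for this sketch only.
    """
    if read_pairs > ULTRA_DEEP_READ_PAIRS:
        return "ultra-deep"
    if read_pairs > DEEP_READ_PAIRS:
        return "deep"
    return "standard"


def novel_fraction(novel, total):
    """Share of representative species that are novel."""
    return novel / total


if __name__ == "__main__":
    print(total_samples(SAMPLES))
    print(depth_tier(120_000_000))
    print(f"{novel_fraction(NOVEL_SPECIES, TOTAL_SPECIES):.1%}")
```

The three country counts sum to the 845 metagenomes reported in the abstract, and 780 of 5414 species corresponds to roughly 14% of the catalog being novel.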
first_indexed 2024-12-21T16:48:04Z
format Article
id doaj.art-5bfdf86bdfd14cfeac6f7d31a1e12745
institution Directory Open Access Journal
issn 1756-994X
language English
last_indexed 2024-12-21T16:48:04Z
publishDate 2021-08-01
publisher BMC
record_format Article
series Genome Medicine
spelling doaj.art-5bfdf86bdfd14cfeac6f7d31a1e12745
2022-12-21T18:56:56Z
eng
BMC
Genome Medicine
1756-994X
2021-08-01
131120
10.1186/s13073-021-00950-7
Human reference gut microbiome catalog including newly assembled genomes from under-represented Asian metagenomes
Chan Yeong Kim (0): Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University
Muyoung Lee (1): Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University
Sunmo Yang (2): Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University
Kyungnam Kim (3): Department of Laboratory Medicine, Research Institute of Bacterial Resistance, Yonsei University College of Medicine
Dongeun Yong (4): Department of Laboratory Medicine, Research Institute of Bacterial Resistance, Yonsei University College of Medicine
Hye Ryun Kim (5): Division of Medical Oncology, Department of Internal Medicine, Yonsei Cancer Center, Yonsei University College of Medicine
Insuk Lee (6): Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University
https://doi.org/10.1186/s13073-021-00950-7
Metagenomic shotgun sequencing
Human gut microbiome
Metagenome-assembled genome
Cross-reactive antigen
title Human reference gut microbiome catalog including newly assembled genomes from under-represented Asian metagenomes
topic Metagenomic shotgun sequencing
Human gut microbiome
Metagenome-assembled genome
Cross-reactive antigen
url https://doi.org/10.1186/s13073-021-00950-7