GPA: A Microbial Genetic Polymorphisms Assignments Tool in Metagenomic Analysis by Bayesian Estimation

Identifying antimicrobial resistant (AMR) bacteria in metagenomics samples is essential for public health and food safety. Next-generation sequencing (NGS) technology has provided a powerful tool in identifying the genetic variation and constructing the correlations between genotype and phenotype in...

Bibliographic Details
Main Authors: Jiarui Li, Pengcheng Du, Adam Yongxin Ye, Yuanyuan Zhang, Chuan Song, Hui Zeng, Chen Chen
Format: Article
Language: English
Published: Oxford University Press 2019-02-01
Series: Genomics, Proteomics & Bioinformatics
Online Access: http://www.sciencedirect.com/science/article/pii/S167202291930066X
collection DOAJ
description Identifying antimicrobial resistant (AMR) bacteria in metagenomics samples is essential for public health and food safety. Next-generation sequencing (NGS) technology has provided a powerful tool in identifying the genetic variation and constructing the correlations between genotype and phenotype in humans and other species. However, for complex bacterial samples, a powerful bioinformatic tool for identifying genetic polymorphisms or copy number variations (CNVs) in given genes is lacking. Here we provide a Bayesian framework for genotype estimation in mixtures of multiple bacteria, named Genetic Polymorphisms Assignments (GPA). Simulation results showed that GPA reduced the false discovery rate (FDR) and mean absolute error (MAE) in CNV and single nucleotide variant (SNV) identification. The framework was validated with whole-genome sequencing and Pool-seq data from Klebsiella pneumoniae using multiple bacterial mixture models, and showed high accuracy in detecting the allele fractions of CNVs and SNVs in AMR genes between two populations. A quantitative study of the changes in AMR gene fractions between two samples showed good consistency with the AMR pattern observed in the individual strains. In addition, the framework, together with genome annotation and population comparison tools, has been integrated into an application that could provide a complete solution for AMR gene identification and quantification in unculturable clinical samples. The GPA package is available at https://github.com/IID-DTH/GPA-package. Keywords: Next-generation sequencing, Pool-seq, Bayesian model, Metagenomics, Genetic polymorphisms
first_indexed 2024-03-08T18:01:45Z
id doaj.art-c4c4a78153984bdebfe0c9732250c9b2
institution Directory Open Access Journal
issn 1672-0229
last_indexed 2025-03-21T00:01:11Z
spelling doaj.art-c4c4a78153984bdebfe0c9732250c9b2 2024-08-03T11:35:50Z eng Oxford University Press, Genomics, Proteomics & Bioinformatics, 1672-0229, 2019-02-01, vol. 17, no. 1, pp. 106-117
GPA: A Microbial Genetic Polymorphisms Assignments Tool in Metagenomic Analysis by Bayesian Estimation
Jiarui Li (0): Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
Pengcheng Du (1): Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
Adam Yongxin Ye (2): Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China
Yuanyuan Zhang (3): Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
Chuan Song (4): Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
Hui Zeng (5): Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China; corresponding author
Chen Chen (6): Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China; corresponding author
http://www.sciencedirect.com/science/article/pii/S167202291930066X