Prediction and analysis of metagenomic operons via MetaRon: a pipeline for prediction of Metagenome and whole-genome opeRons

Abstract Background Efficient regulation of bacterial genes in response to the environmental stimulus results in unique gene clusters known as operons. Lack of complete operonic reference and functional information makes the prediction of metagenomic operons a challenging task; thus, opening new per...

Bibliographic Details
Main Authors: Syed Shujaat Ali Zaidi, Masood Ur Rehman Kayani, Xuegong Zhang, Younan Ouyang, Imran Haider Shamsi
Format: Article
Language: English
Published: BMC 2021-01-01
Series: BMC Genomics
Subjects: Escherichia coli; Metagenomic; Operon prediction; Secondary metabolites; Microbiome
Online Access: https://doi.org/10.1186/s12864-020-07357-5
author Syed Shujaat Ali Zaidi
Masood Ur Rehman Kayani
Xuegong Zhang
Younan Ouyang
Imran Haider Shamsi
collection DOAJ
description Abstract. Background: Efficient regulation of bacterial genes in response to environmental stimuli results in unique gene clusters known as operons. The lack of complete operonic references and functional information makes the prediction of metagenomic operons a challenging task; addressing it opens new perspectives on the interpretation of host-microbe interactions. Results: In this work, we identified whole-genome and metagenomic operons via MetaRon (Metagenome and whole-genome opeRon prediction pipeline). MetaRon identifies operons without any experimental or functional information. MetaRon was applied to datasets with different levels of complexity and information: a whole genome, a simulated mixture of three whole genomes (E. coli MG1655, Mycobacterium tuberculosis H37Rv and Bacillus subtilis str. 16), an E. coli c20 draft genome extracted from chicken gut, and finally 145 whole-metagenome samples from human gut. MetaRon consistently achieved high operon prediction sensitivity, specificity and accuracy across the E. coli whole genome (97.8, 94.1 and 92.4%), the simulated genome (93.7, 75.5 and 88.1%) and E. coli c20 (87, 91 and 88%), respectively. Finally, we identified 1,232,407 unique operons from 145 paired-end human gut metagenome samples. We also report a strong association of type 2 diabetes with maltose phosphorylase (K00691), 3-deoxy-D-glycero-D-galacto-nononate 9-phosphate synthase (K21279) and an uncharacterized protein (K07101). Conclusion: With MetaRon, we were able to remove two notable limitations of existing whole-genome operon prediction methods: (1) generalizability (the ability to predict operons in unrelated bacterial genomes), and (2) whole-genome and metagenomic data management. We also demonstrate the use of operons as a subset to represent the trends of secondary metabolites in whole-metagenome data and the role of secondary metabolites in the occurrence of a disease condition. Using operonic data from a metagenome to study secondary metabolic trends significantly reduces the data volume to a more precise subset. Furthermore, the identification of metabolic pathways associated with the occurrence of type 2 diabetes (T2D) presents another dimension for analyzing the human gut metagenome. Presumably, this study is the first organized effort to predict metagenomic operons and perform a detailed analysis in association with a disease, in this case type 2 diabetes. Applying MetaRon to metagenomic data at diverse scales will help in understanding gene regulation and therapeutic metagenomics.
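The abstract reports sensitivity, specificity and accuracy without stating how they are computed. As a minimal sketch, assuming the textbook confusion-matrix definitions (the authors' exact definitions are not given in this record and may differ), the three percentages could be derived as follows. The function name `operon_metrics` and the counts are hypothetical and chosen only for illustration; they are not values from the paper.

```python
def operon_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) as percentages.

    tp/fn: true operon gene pairs predicted as operonic / non-operonic.
    tn/fp: true non-operon gene pairs predicted as non-operonic / operonic.
    Assumes each denominator is non-zero.
    """
    sensitivity = 100.0 * tp / (tp + fn)               # share of real operon pairs recovered
    specificity = 100.0 * tn / (tn + fp)               # share of non-operon pairs rejected
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)  # share of all pairs classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sens, spec, acc = operon_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.1f}% specificity={spec:.1f}% accuracy={acc:.1f}%")
```

Working per gene pair rather than per whole operon is itself an assumption here; a per-operon evaluation would use the same formulas with different counts.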
first_indexed 2024-12-17T08:39:22Z
format Article
id doaj.art-caf407426b3f4db6ad8bf9d0d65f377b
institution Directory of Open Access Journals
issn 1471-2164
language English
last_indexed 2024-12-17T08:39:22Z
publishDate 2021-01-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-caf407426b3f4db6ad8bf9d0d65f377b; 2022-12-21T21:56:24Z; eng; BMC; BMC Genomics; 1471-2164; 2021-01-01; vol. 22, no. 1, pp. 1-14; 10.1186/s12864-020-07357-5
Syed Shujaat Ali Zaidi (0): Bioinformatics Division, Beijing National Research Institute for Information Science and Technology (BNRIST), Department of Automation, Tsinghua University
Masood Ur Rehman Kayani (1): Center for Microbiota and Immunological Diseases, Shanghai General Hospital, Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine
Xuegong Zhang (2): Bioinformatics Division, Beijing National Research Institute for Information Science and Technology (BNRIST), Department of Automation, Tsinghua University
Younan Ouyang (3): China National Rice Research Institute (CNRRI)
Imran Haider Shamsi (4): Department of Agronomy, College of Agriculture and Biotechnology, Key Laboratory of Crop Germplasm Resource, Zhejiang University
title Prediction and analysis of metagenomic operons via MetaRon: a pipeline for prediction of Metagenome and whole-genome opeRons
topic Escherichia coli
Metagenomic
Operon prediction
Secondary metabolites
Microbiome
url https://doi.org/10.1186/s12864-020-07357-5