PCycDB: a comprehensive and accurate database for fast analysis of phosphorus cycling genes


Bibliographic Details
Main Authors: Jiaxiong Zeng, Qichao Tu, Xiaoli Yu, Lu Qian, Cheng Wang, Longfei Shu, Fei Liu, Shengwei Liu, Zhijian Huang, Jianguo He, Qingyun Yan, Zhili He
Format: Article
Language: English
Published: BMC 2022-07-01
Series: Microbiome
Online Access: https://doi.org/10.1186/s40168-022-01292-1
collection DOAJ
description Abstract. Background: Phosphorus (P) is one of the most essential macronutrients on the planet, and microorganisms (including bacteria and archaea) play a key role in P cycling in all living things and ecosystems. However, a comprehensive understanding of key P cycling genes (PCGs) and microorganisms (PCMs), as well as their ecological functions, remains elusive even with the rapid advancement of metagenome sequencing technologies. One of the major challenges is the lack of a comprehensive and accurately annotated P cycling functional gene database. Results: In this study, we constructed a well-curated P cycling database (PCycDB) covering 139 gene families and 10 P metabolic processes, including several previously ignored PCGs such as pafA, which encodes a phosphate-insensitive phosphatase; ptxABCD (phosphite-related genes); and the novel aepXVWPS genes for 2-aminoethylphosphonate transporters. On simulated gene datasets, we achieved an annotation accuracy, positive predictive value (PPV), sensitivity, specificity, and negative predictive value (NPV) of 99.8%, 96.1%, 99.9%, 99.8%, and 99.9%, respectively. Compared to other orthology databases, PCycDB is more accurate and more comprehensive, and it profiles PCGs faster. We used PCycDB to analyze P cycling microbial communities from representative natural and engineered environments and showed that it can be applied across different environments. Conclusions: We demonstrate that PCycDB is a powerful tool for advancing our understanding of microbially driven P cycling in the environment, offering high coverage, high accuracy, and rapid analysis of metagenome sequencing data. PCycDB is available at https://github.com/ZengJiaxiong/Phosphorus-cycling-database
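The five evaluation metrics named in the abstract have standard definitions in terms of a confusion matrix (true/false positives and negatives). The following is a minimal Python sketch of those textbook definitions only; it is not the authors' evaluation code, the function name `confusion_metrics` is hypothetical, and the counts in the usage lines are made-up illustration values, not data from the paper.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute standard classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs here).
    """
    total = tp + fp + tn + fn
    # Guard against empty classes so the ratios below are well defined.
    if min(total, tp + fp, tp + fn, tn + fp, tn + fn) == 0:
        raise ValueError("every denominator must be non-zero")
    return {
        "accuracy": (tp + tn) / total,    # correct calls over all calls
        "ppv": tp / (tp + fp),            # positive predictive value (precision)
        "sensitivity": tp / (tp + fn),    # true positive rate (recall)
        "specificity": tn / (tn + fp),    # true negative rate
        "npv": tn / (tn + fn),            # negative predictive value
    }


# Illustration with invented counts (NOT results from the PCycDB paper).
metrics = confusion_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because PPV depends on the number of false positives relative to true positives, it can sit noticeably below the other four metrics when positives are rare, which is consistent with the pattern of values reported in the abstract.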
first_indexed 2024-12-11T15:32:18Z
id doaj.art-65d2b40b621048058ad5f4cf2d496c14
institution Directory Open Access Journal
issn 2049-2618
last_indexed 2024-12-11T15:32:18Z
spelling doaj.art-65d2b40b621048058ad5f4cf2d496c14 2022-12-22T01:00:01Z eng BMC Microbiome 2049-2618 2022-07-01 101116 10.1186/s40168-022-01292-1
Author affiliations:
Jiaxiong Zeng, Xiaoli Yu, Lu Qian, Cheng Wang, Longfei Shu, Fei Liu, Shengwei Liu, Zhijian Huang, Jianguo He, Qingyun Yan, Zhili He: Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University
Qichao Tu: Institute of Marine Science and Technology, Shandong University
topic Phosphorus cycling gene/microorganism
Database
Accuracy
Comprehensiveness
Metagenome sequencing data