A deep learning based approach for prediction of Chlamydomonas reinhardtii phosphorylation sites

Abstract Protein phosphorylation, which is one of the most important post-translational modifications (PTMs), is involved in regulating myriad cellular processes. Herein, we present a novel deep learning based approach for organism-specific protein phosphorylation site prediction in Chlamydomonas re...


Bibliographic Details
Main Authors: Niraj Thapa, Meenal Chaudhari, Anthony A. Iannetta, Clarence White, Kaushik Roy, Robert H. Newman, Leslie M. Hicks, Dukka B. KC
Format: Article
Language: English
Published: Nature Portfolio 2021-06-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-021-91840-w
author Niraj Thapa
Meenal Chaudhari
Anthony A. Iannetta
Clarence White
Kaushik Roy
Robert H. Newman
Leslie M. Hicks
Dukka B. KC
collection DOAJ
description Abstract Protein phosphorylation, one of the most important post-translational modifications (PTMs), is involved in regulating myriad cellular processes. Herein, we present a novel deep learning based approach for organism-specific protein phosphorylation site prediction in Chlamydomonas reinhardtii, a model algal phototroph. An ensemble model combining convolutional neural networks and long short-term memory (LSTM) achieves the best performance in predicting phosphorylation sites in C. reinhardtii. For this model, named Chlamy-EnPhosSite, the best measured AUC and MCC on a combined dataset of serine (S) and threonine (T) sites in independent testing are 0.90 and 0.64, respectively, both higher than the corresponding measures for other predictors. When applied to the entire C. reinhardtii proteome (totaling 1,809,304 S and T sites), Chlamy-EnPhosSite yielded 499,411 predicted phosphorylated sites at a cut-off value of 0.5 and 237,949 at a cut-off value of 0.7. In a blinded study, these predictions were compared to an experimental dataset of phosphosites identified by liquid chromatography-tandem mass spectrometry (LC–MS/MS): approximately 89.69% of 2,663 C. reinhardtii S and T phosphorylation sites were successfully predicted by Chlamy-EnPhosSite at a probability cut-off of 0.5, and 76.83% were successfully identified at a more stringent cut-off of 0.7. Interestingly, Chlamy-EnPhosSite also successfully predicted experimentally confirmed phosphorylation sites (e.g., RPS6 S245) in a protein sequence that did not appear in the training dataset, highlighting its prediction accuracy and the power of leveraging predictions to identify biologically relevant PTM sites. These results demonstrate that our method represents a robust and complementary technique for high-throughput phosphorylation site prediction in C. reinhardtii, with the potential to serve as a useful tool for the community. Chlamy-EnPhosSite will contribute to the understanding of how protein phosphorylation influences various biological processes in this important model microalga.
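The abstract reports results at two probability cut-offs (0.5 and 0.7) and summarizes performance with MCC. As a minimal sketch of how such a cut-off turns per-site scores into calls and how MCC and recall follow from the resulting counts, the Python below uses made-up labels and scores. The function names, the toy data, and the `score >= cutoff` convention are assumptions for illustration only; nothing here is taken from the Chlamy-EnPhosSite implementation.

```python
from math import sqrt

def confusion_counts(labels, scores, cutoff):
    """Count (tp, fp, tn, fn), calling a site positive when score >= cutoff.

    The >= convention is an assumption; the record does not say how ties are handled.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = score >= cutoff
        if predicted and label == 1:
            tp += 1
        elif predicted and label == 0:
            fp += 1
        elif not predicted and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def recall(tp, fn):
    """Fraction of true phosphosites recovered at the chosen cut-off."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical toy data: 1 = phosphorylated site, 0 = non-phosphorylated site.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.65, 0.40, 0.60, 0.30, 0.20, 0.10]

for cutoff in (0.5, 0.7):
    tp, fp, tn, fn = confusion_counts(labels, scores, cutoff)
    print(f"cutoff={cutoff}: recall={recall(tp, fn):.2f}, MCC={mcc(tp, fp, tn, fn):.2f}")
```

On this toy data, raising the cut-off from 0.5 to 0.7 removes the one false positive but also drops one true site, which mirrors the trade-off the abstract describes between the 0.5 and 0.7 cut-offs.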
first_indexed 2024-12-22T06:52:58Z
format Article
id doaj.art-763a7ff34700418ebf40a11c4465bd6e
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-12-22T06:52:58Z
publishDate 2021-06-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-763a7ff34700418ebf40a11c4465bd6e
2022-12-21T18:35:04Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2021-06-01
111112
10.1038/s41598-021-91840-w
A deep learning based approach for prediction of Chlamydomonas reinhardtii phosphorylation sites
Niraj Thapa (0), Department of Computational Data Science and Engineering, North Carolina A&T State University
Meenal Chaudhari (1), Department of Computational Data Science and Engineering, North Carolina A&T State University
Anthony A. Iannetta (2), Department of Chemistry, University of North Carolina at Chapel Hill
Clarence White (3), Department of Computational Data Science and Engineering, North Carolina A&T State University
Kaushik Roy (4), Department of Computer Science, North Carolina A&T State University
Robert H. Newman (5), Department of Biology, North Carolina A&T State University
Leslie M. Hicks (6), Department of Chemistry, University of North Carolina at Chapel Hill
Dukka B. KC (7), Electrical Engineering and Computer Science Department, Wichita State University
https://doi.org/10.1038/s41598-021-91840-w
title A deep learning based approach for prediction of Chlamydomonas reinhardtii phosphorylation sites
url https://doi.org/10.1038/s41598-021-91840-w