Explicit representation of protein activity states significantly improves causal discovery of protein phosphorylation networks

Abstract Background Protein phosphorylation networks play an important role in cell signaling. In these networks, phosphorylation of a protein kinase usually leads to its activation, which in turn will phosphorylate its downstream target proteins. A phosphorylation network is essentially a causal ne...


Bibliographic Details
Main Authors: Jinling Liu, Xiaojun Ma, Gregory F. Cooper, Xinghua Lu
Format: Article
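Language: English
Published: BMC 2020-09-01
Series: BMC Bioinformatics
Subjects: Causal inference; Protein kinase activity state; Protein phosphorylation networks; Cancer signaling pathways
Online Access: http://link.springer.com/article/10.1186/s12859-020-03676-2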
collection DOAJ
description Abstract
Background: Protein phosphorylation networks play an important role in cell signaling. In these networks, phosphorylation of a protein kinase usually leads to its activation, which in turn will phosphorylate its downstream target proteins. A phosphorylation network is essentially a causal network, which can be learned by causal inference algorithms. Prior efforts have applied such algorithms to data measuring protein phosphorylation levels, assuming that the phosphorylation levels represent protein activity states. However, the phosphorylation status of a kinase does not always reflect its activity state, because interventions such as inhibitors or mutations can directly affect its activity state without changing its phosphorylation status. Thus, when cellular systems are subjected to extensive perturbations, the statistical relationships between phosphorylation states of proteins may be disrupted, making it difficult to reconstruct the true protein phosphorylation network. Here, we describe a novel framework to address this challenge.
Results: We have developed a causal discovery framework that explicitly represents the activity state of each protein kinase as an unmeasured variable and developed a novel algorithm called “InferA” to infer the protein activity states, which allows us to incorporate the protein phosphorylation level, pharmacological interventions and prior knowledge. We applied our framework to simulated datasets and to a real-world dataset. The simulation experiments demonstrated that explicit representation of activity states of protein kinases allows one to effectively represent the impact of interventions and thus enabled our framework to accurately recover the ground-truth causal network. Results from the real-world dataset showed that the explicit representation of protein activity states allowed an effective and data-driven integration of the prior knowledge by InferA, which further leads to the recovery of a phosphorylation network that is more consistent with experiment results.
Conclusions: Explicit representation of the protein activity states by our novel framework significantly enhances causal discovery of protein phosphorylation networks.
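The abstract's central argument (an inhibitor can switch off a kinase's activity without changing its phosphorylation status, which weakens the measured phosphorylation-to-phosphorylation relationship) can be illustrated with a toy simulation. The sketch below is not the authors' InferA algorithm or their simulation setup; the two-protein model, the probabilities 0.9 and 0.1, and the function names `simulate` and `pearson` are all assumptions made for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_samples=2000, inhibitor_fraction=0.5, seed=0):
    """Toy model (hypothetical): kinase A phosphorylates target B.

    Returns three binary lists: phosphorylation of A (measured),
    activity of A (normally unmeasured), phosphorylation of B (measured).
    """
    rng = random.Random(seed)
    phospho_a, activity_a, phospho_b = [], [], []
    for _ in range(n_samples):
        p_a = 1 if rng.random() < 0.5 else 0
        # An inhibitor blocks A's activity but leaves its phosphorylation as is.
        inhibited = rng.random() < inhibitor_fraction
        act = 0 if inhibited else p_a
        # B is phosphorylated mostly when A is active (assumed 0.9 vs 0.1).
        p_b = 1 if rng.random() < (0.9 if act else 0.1) else 0
        phospho_a.append(p_a)
        activity_a.append(act)
        phospho_b.append(p_b)
    return phospho_a, activity_a, phospho_b


if __name__ == "__main__":
    for fraction in (0.0, 0.5):
        p_a, act_a, p_b = simulate(inhibitor_fraction=fraction)
        print(
            f"inhibited fraction {fraction}: "
            f"corr(phospho A, phospho B) = {pearson(p_a, p_b):.2f}, "
            f"corr(activity A, phospho B) = {pearson(act_a, p_b):.2f}"
        )
```

In this toy model the correlation between the two measured phosphorylation levels drops once half of the samples are inhibited, while the correlation between A's activity state and B's phosphorylation stays strong, which is the motivation the abstract gives for modeling activity as its own variable.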
first_indexed 2024-12-20T14:46:47Z
id doaj.art-83442629a4d64e3a9be956607447f42c
institution Directory Open Access Journal
issn 1471-2105
last_indexed 2024-12-20T14:46:47Z
spelling doaj.art-83442629a4d64e3a9be956607447f42c | 2022-12-21T19:37:06Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2020-09-01 | 21 | S13 | 1 | 17 | 10.1186/s12859-020-03676-2 | Explicit representation of protein activity states significantly improves causal discovery of protein phosphorylation networks | Jinling Liu (0), Xiaojun Ma (1), Gregory F. Cooper (2), Xinghua Lu (3) | Affiliations 0-3: Department of Biomedical Informatics, University of Pittsburgh | http://link.springer.com/article/10.1186/s12859-020-03676-2 | Causal inference; Protein kinase activity state; Protein phosphorylation networks; Cancer signaling pathways
title Explicit representation of protein activity states significantly improves causal discovery of protein phosphorylation networks
topic Causal inference
Protein kinase activity state
Protein phosphorylation networks
Cancer signaling pathways