Molecular complex detection in protein interaction networks through reinforcement learning



Bibliographic Details
Main Authors:	Meghana V. Palukuri, Ridhi S. Patil, Edward M. Marcotte
Format:	Article
Language:	English
Published:	BMC 2023-08-01
Series:	BMC Bioinformatics
Subjects:	Community detection, Reinforcement learning, Protein complex, Protein interactions
Online Access:	https://doi.org/10.1186/s12859-023-05425-7

_version_	1797752624631513088
author	Meghana V. Palukuri, Ridhi S. Patil, Edward M. Marcotte
author_sort	Meghana V. Palukuri
collection	DOAJ
description	Background: Proteins often assemble into higher-order complexes to perform their biological functions. Such protein–protein interactions (PPIs) are often experimentally measured for pairs of proteins and summarized in a weighted PPI network, to which community detection algorithms can be applied to define the various higher-order protein complexes. Current methods include unsupervised and supervised approaches, often assuming that protein complexes manifest only as dense subgraphs. Supervised approaches focus on learning which subgraphs correspond to complexes rather than on how to find those subgraphs in a network, a search that is currently handled with heuristics. However, learning to walk trajectories on a network to identify protein complexes leads naturally to a reinforcement learning (RL) approach, a strategy not extensively explored for community detection. Here, we develop and evaluate a reinforcement learning pipeline for community detection on weighted protein–protein interaction networks to detect new protein complexes. The algorithm is trained to calculate the value of different subgraphs encountered while walking on the network to reconstruct known complexes. A distributed prediction algorithm then scales the RL pipeline to search for novel protein complexes on large PPI networks.
Results: The reinforcement learning pipeline is applied to a human PPI network consisting of 8k proteins and 60k PPIs, yielding 1,157 protein complexes. The method demonstrated competitive accuracy with improved speed compared to previous algorithms. We highlight protein complexes containing minimally characterized proteins such as C4orf19, C18orf21, and KIAA1522. Additionally, the results suggest that TMC04 is a putative additional subunit of the KICSTOR complex and confirm, by 3D structural modeling, the involvement of C15orf41 in a higher-order complex with HIRA, CDAN1, and ASF1A.
Conclusions: Reinforcement learning offers several distinct advantages for community detection, including scalability and knowledge of the walk trajectories defining those communities. Applied to currently available human protein interaction networks, this method achieved accuracy comparable to other algorithms with notable savings in computational time, and in turn led to clear predictions of protein function and interactions for several uncharacterized human proteins.
first_indexed	2024-03-12T17:06:06Z
format	Article
id	doaj.art-784b5a9af2294bd490f5785292fc0292
institution	Directory Open Access Journal
issn	1471-2105
language	English
last_indexed	2024-03-12T17:06:06Z
publishDate	2023-08-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-784b5a9af2294bd490f5785292fc0292; 2023-08-06T11:26:12Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2023-08-01; 241127; 10.1186/s12859-023-05425-7; Molecular complex detection in protein interaction networks through reinforcement learning; Meghana V. Palukuri (Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas); Ridhi S. Patil (Department of Biomedical Engineering, University of Texas); Edward M. Marcotte (Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas); https://doi.org/10.1186/s12859-023-05425-7; Community detection; Reinforcement learning; Protein complex; Protein interactions
title	Molecular complex detection in protein interaction networks through reinforcement learning
topic	Community detection, Reinforcement learning, Protein complex, Protein interactions
url	https://doi.org/10.1186/s12859-023-05425-7
