Enhancing coevolutionary signals in protein–protein interaction prediction through clade-wise alignment integration

Bibliographic Details
Main Authors: Tao Fang, Damian Szklarczyk, Radja Hachilif, Christian von Mering
Format: Article
Language: English
Published: Nature Portfolio 2024-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-55655-9
collection DOAJ
description Abstract Protein–protein interactions (PPIs) play essential roles in most biological processes. The binding interfaces between interacting proteins impose evolutionary constraints that have successfully been employed to predict PPIs from multiple sequence alignments (MSAs). To construct MSAs, critical choices have to be made: how to ensure the reliable identification of orthologs, and how to optimally balance the need for large alignments versus sufficient alignment quality. Here, we propose a divide-and-conquer strategy for MSA generation: instead of building a single, large alignment for each protein, multiple distinct alignments are constructed under distinct clades in the tree of life. Coevolutionary signals are searched separately within these clades, and are only subsequently integrated using machine learning techniques. We find that this strategy markedly improves overall prediction performance, concomitant with better alignment quality. Using the popular DCA algorithm to systematically search pairs of such alignments, a genome-wide all-against-all interaction scan in a bacterial genome is demonstrated. Given the recent successes of AlphaFold in predicting direct PPIs at atomic detail, a discover-and-refine approach is proposed: our method could provide a fast and accurate strategy for pre-screening the entire genome, submitting to AlphaFold only promising interaction candidates—thus reducing false positives as well as computation time.
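The clade-wise integration and pre-screening idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the clade names, the protein names, the toy scores, the threshold, and the use of a plain logistic regression as a stand-in for the machine learning step, which the abstract does not specify. It is not the authors' implementation, and the per-clade coevolution scores (for example from a DCA run on each clade's alignments) are taken as given rather than computed.

```python
import math

# Hypothetical clade names; the abstract does not list the clades used.
CLADES = ["clade_a", "clade_b", "clade_c"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit a plain logistic regression by batch gradient descent.

    Stand-in for the unspecified machine learning integration step.
    """
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Integrated interaction score in (0, 1) for one protein pair."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def prescreen(candidates, weights, bias, threshold=0.5):
    """Return pairs whose integrated score passes the threshold, best first."""
    scored = [(predict(weights, bias, feats), pair) for pair, feats in candidates.items()]
    return [pair for score, pair in sorted(scored, reverse=True) if score >= threshold]


# Invented toy data: one coevolution score per clade (same order as CLADES).
train_x = [
    [0.9, 0.8, 0.7],  # labelled interacting
    [0.8, 0.9, 0.6],  # labelled interacting
    [0.1, 0.2, 0.1],  # labelled non-interacting
    [0.2, 0.1, 0.3],  # labelled non-interacting
]
train_y = [1, 1, 0, 0]

weights, bias = train_logistic(train_x, train_y)

# Hypothetical candidate pairs from an all-against-all scan.
candidates = {
    ("protA", "protB"): [0.85, 0.90, 0.75],
    ("protA", "protC"): [0.15, 0.10, 0.20],
}
shortlist = prescreen(candidates, weights, bias)
```

In this sketch, `shortlist` holds the pairs that would be passed on to a slower structure-based method in a discover-and-refine workflow; the remaining pairs are dropped before that more expensive step.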
first_indexed 2024-04-24T23:08:00Z
id doaj.art-4aa43a5ba80f4728a24bce72ba0cab7a
institution Directory of Open Access Journals
issn 2045-2322
affiliation Department of Molecular Life Sciences, University of Zurich (all four authors)
citation Scientific Reports, volume 14, issue 1, pages 1–17 (2024-03-01)