A Max-Margin Model for Predicting Residue—Base Contacts in Protein–RNA Interactions

Bibliographic Details
Main Authors: Shunya Kashiwagi, Kengo Sato, Yasubumi Sakakibara
Format: Article
Language: English
Published: MDPI AG 2021-10-01
Series: Life
Subjects: protein–RNA interaction; RNA secondary structure; structured support vector machine
Online Access: https://www.mdpi.com/2075-1729/11/11/1135
_version_ 1797509645847232512
author Shunya Kashiwagi
Kengo Sato
Yasubumi Sakakibara
author_facet Shunya Kashiwagi
Kengo Sato
Yasubumi Sakakibara
author_sort Shunya Kashiwagi
collection DOAJ
description Protein–RNA interactions (PRIs) are essential for many biological processes, so understanding aspects of the sequences and structures involved in PRIs is important for unraveling such processes. Because of the expensive and time-consuming techniques required for experimental determination of complex protein–RNA structures, various computational methods have been developed to predict PRIs. However, most of these methods focus on predicting only RNA-binding regions in proteins or only protein-binding motifs in RNA. Methods for predicting entire residue–base contacts in PRIs have not yet achieved sufficient accuracy. Furthermore, some of these methods require the identification of 3D structures or homologous sequences, which are not available for all protein and RNA sequences. Here, we propose a method for predicting residue–base contacts between proteins and RNAs using only sequence information and structural information predicted from sequences. The method can be applied to any protein–RNA pair, even when rich information such as a 3D structure is not available. In this method, residue–base contact prediction is formalized as an integer programming problem. We predict a residue–base contact map that maximizes a scoring function based on sequence-based features such as k-mers of sequences and the predicted secondary structure. The scoring function is trained using a max-margin framework from known PRIs with 3D structures. To verify our method, we conducted several computational experiments. The results suggest that our method, which is based only on sequence information, is comparable with RNA-binding residue prediction methods based on known binding data.
first_indexed 2024-03-10T05:20:42Z
format Article
id doaj.art-6bfbdfd3f4b145c8a868ca278aa2370e
institution Directory Open Access Journal
issn 2075-1729
language English
last_indexed 2024-03-10T05:20:42Z
publishDate 2021-10-01
publisher MDPI AG
record_format Article
series Life
spelling doaj.art-6bfbdfd3f4b145c8a868ca278aa2370e
2023-11-23T00:03:19Z
eng
MDPI AG
Life
2075-1729
2021-10-01
11
11
1135
10.3390/life11111135
A Max-Margin Model for Predicting Residue—Base Contacts in Protein–RNA Interactions
Shunya Kashiwagi, Kengo Sato, Yasubumi Sakakibara
Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan (all three authors)
https://www.mdpi.com/2075-1729/11/11/1135
protein–RNA interaction
RNA secondary structure
structured support vector machine
title A Max-Margin Model for Predicting Residue—Base Contacts in Protein–RNA Interactions
title_sort max margin model for predicting residue base contacts in protein rna interactions
topic protein–RNA interaction
RNA secondary structure
structured support vector machine
url https://www.mdpi.com/2075-1729/11/11/1135