PepCNN deep learning tool for predicting peptide binding residues in proteins using sequence, structural, and language model features

Bibliographic Details
Main Authors: Abel Chandra, Alok Sharma, Iman Dehzangi, Tatsuhiko Tsunoda, Abdul Sattar
Format: Article
Language: English
Published: Nature Portfolio 2023-11-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-47624-5
collection DOAJ
description Abstract: Protein–peptide interactions play a crucial role in various cellular processes and are implicated in abnormal cellular behaviors leading to diseases such as cancer. Therefore, understanding these interactions is vital for both functional genomics and drug discovery efforts. Despite a significant increase in the availability of protein–peptide complexes, experimental methods for studying these interactions remain laborious, time-consuming, and expensive. Computational methods offer a complementary approach but often fall short in terms of prediction accuracy. To address these challenges, we introduce PepCNN, a deep learning-based prediction model that incorporates structural and sequence-based information from primary protein sequences. By combining half-sphere exposure, position-specific scoring matrices from a multiple-sequence alignment tool, and embeddings from a pre-trained protein language model, PepCNN outperforms state-of-the-art methods in terms of specificity, precision, and AUC. The PepCNN software and datasets are publicly available at https://github.com/abelavit/PepCNN.git.
first_indexed 2024-03-09T05:45:59Z
id doaj.art-4dd717e9a5714b72bd89267464af23c1
institution Directory Open Access Journal
issn 2045-2322
last_indexed 2024-03-09T05:45:59Z
spelling doaj.art-4dd717e9a5714b72bd89267464af23c1; 2023-12-03T12:20:53Z; eng; Nature Portfolio; Scientific Reports; ISSN 2045-2322; 2023-11-01; volume 13, issue 1, pages 1–14; 10.1038/s41598-023-47624-5
Abel Chandra (Institute for Integrated and Intelligent Systems, Griffith University)
Alok Sharma (Institute for Integrated and Intelligent Systems, Griffith University)
Iman Dehzangi (Department of Computer Science, Rutgers University)
Tatsuhiko Tsunoda (Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo)
Abdul Sattar (Institute for Integrated and Intelligent Systems, Griffith University)