PepCNN deep learning tool for predicting peptide binding residues in proteins using sequence, structural, and language model features
Abstract Protein–peptide interactions play a crucial role in various cellular processes and are implicated in abnormal cellular behaviors leading to diseases such as cancer. Therefore, understanding these interactions is vital for both functional genomics and drug discovery efforts. Despite a signif...
Main Authors: | Abel Chandra, Alok Sharma, Iman Dehzangi, Tatsuhiko Tsunoda, Abdul Sattar |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2023-11-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-023-47624-5 |
_version_ | 1797415319521722368 |
---|---|
author | Abel Chandra, Alok Sharma, Iman Dehzangi, Tatsuhiko Tsunoda, Abdul Sattar |
author_sort | Abel Chandra |
collection | DOAJ |
description | Abstract: Protein–peptide interactions play a crucial role in various cellular processes and are implicated in abnormal cellular behaviors leading to diseases such as cancer. Therefore, understanding these interactions is vital for both functional genomics and drug discovery efforts. Despite a significant increase in the availability of protein–peptide complexes, experimental methods for studying these interactions remain laborious, time-consuming, and expensive. Computational methods offer a complementary approach but often fall short in terms of prediction accuracy. To address these challenges, we introduce PepCNN, a deep learning-based prediction model that incorporates structural and sequence-based information from primary protein sequences. By combining half-sphere exposure, position-specific scoring matrices from a multiple-sequence alignment tool, and embeddings from a pre-trained protein language model, PepCNN outperforms state-of-the-art methods in terms of specificity, precision, and AUC. The PepCNN software and datasets are publicly available at https://github.com/abelavit/PepCNN.git. |
first_indexed | 2024-03-09T05:45:59Z |
format | Article |
id | doaj.art-4dd717e9a5714b72bd89267464af23c1 |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-03-09T05:45:59Z |
publishDate | 2023-11-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-4dd717e9a5714b72bd89267464af23c1; 2023-12-03T12:20:53Z; eng; Nature Portfolio; Scientific Reports; ISSN 2045-2322; 2023-11-01; vol. 13, no. 1, pp. 1-14; 10.1038/s41598-023-47624-5; PepCNN deep learning tool for predicting peptide binding residues in proteins using sequence, structural, and language model features; Abel Chandra (Institute for Integrated and Intelligent Systems, Griffith University); Alok Sharma (Institute for Integrated and Intelligent Systems, Griffith University); Iman Dehzangi (Department of Computer Science, Rutgers University); Tatsuhiko Tsunoda (Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo); Abdul Sattar (Institute for Integrated and Intelligent Systems, Griffith University); https://doi.org/10.1038/s41598-023-47624-5 |
title | PepCNN deep learning tool for predicting peptide binding residues in proteins using sequence, structural, and language model features |
url | https://doi.org/10.1038/s41598-023-47624-5 |