SAMbinder: A Web Server for Predicting S-Adenosyl-L-Methionine Binding Residues of a Protein From Its Amino Acid Sequence
Motivation: S-adenosyl-L-methionine (SAM) is an essential cofactor present in biological systems and plays a key role in many diseases. There is a need for a method that predicts SAM binding sites in a protein, to support the design of drugs against SAM-associated diseases. To the best of our knowledge,...
Main Authors: | Piyush Agrawal, Gaurav Mishra, Gajendra P. S. Raghava |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2020-01-01 |
Series: | Frontiers in Pharmacology |
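Subjects: | S-adenosine-L-methionine; PSSM profile; in silico prediction; cancer; machine learning technique (MLT) |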
Online Access: | https://www.frontiersin.org/article/10.3389/fphar.2019.01690/full |
_version_ | 1798046009401540608 |
---|---|
author | Piyush Agrawal, Gaurav Mishra, Gajendra P. S. Raghava |
author_sort | Piyush Agrawal |
collection | DOAJ |
description | Motivation: S-adenosyl-L-methionine (SAM) is an essential cofactor present in biological systems and plays a key role in many diseases. There is a need for a method that predicts SAM binding sites in a protein, to support the design of drugs against SAM-associated diseases. To the best of our knowledge, there is no method that can predict the binding site of SAM in a given protein sequence. Result: This manuscript describes SAMbinder, a method developed for predicting SAM-interacting residues in a protein from its primary sequence. All models were trained, tested, and evaluated on 145 SAM-binding protein chains where no two chains have more than 40% sequence similarity. First, models were developed using different machine learning techniques on a balanced data set containing 2,188 SAM-interacting and an equal number of non-interacting residues. Our random forest based model developed using the binary profile feature achieved a maximum Matthews Correlation Coefficient (MCC) of 0.42 with an area under the receiver operating characteristic curve (AUROC) of 0.79 on the validation data set. The performance of our models improved significantly, from MCC 0.42 to 0.61, when evolutionary information in the form of the position-specific scoring matrix (PSSM) profile was used as a feature. We also developed models on a realistic data set containing 2,188 SAM-interacting and 40,029 non-interacting residues and achieved a maximum MCC of 0.61 with an AUROC of 0.89. To evaluate the performance of our models, we used internal as well as external cross-validation techniques. Availability and Implementation: https://webs.iiitd.edu.in/raghava/sambinder/ |
first_indexed | 2024-04-11T23:30:34Z |
format | Article |
id | doaj.art-d9ff6eb8c0914658b38cb6f297a84dc5 |
institution | Directory Open Access Journal |
issn | 1663-9812 |
language | English |
last_indexed | 2024-04-11T23:30:34Z |
publishDate | 2020-01-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Pharmacology |
spelling | doaj.art-d9ff6eb8c0914658b38cb6f297a84dc5; 2022-12-22T03:57:10Z; eng; Frontiers Media S.A.; Frontiers in Pharmacology; 1663-9812; 2020-01-01; 10; 10.3389/fphar.2019.01690; 497036; SAMbinder: A Web Server for Predicting S-Adenosyl-L-Methionine Binding Residues of a Protein From Its Amino Acid Sequence; Piyush Agrawal (0, 1); Gaurav Mishra (2, 3); Gajendra P. S. Raghava (4); 0: Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; 1: Bioinformatics Center, CSIR-Institute of Microbial Technology, Chandigarh, India; 2: Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; 3: Department of Electrical Engineering, Shiv Nadar University, Greater Noida, India; 4: Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India; https://www.frontiersin.org/article/10.3389/fphar.2019.01690/full; S-adenosine-L-methionine; PSSM profile; in silico prediction; cancer; machine learning technique (MLT) |
title | SAMbinder: A Web Server for Predicting S-Adenosyl-L-Methionine Binding Residues of a Protein From Its Amino Acid Sequence |
topic | S-adenosine-L-methionine; PSSM profile; in silico prediction; cancer; machine learning technique (MLT) |
url | https://www.frontiersin.org/article/10.3389/fphar.2019.01690/full |