iProDNA-CapsNet: identifying protein-DNA binding residues using capsule neural networks

Abstract Background Because protein-DNA interactions are essential to diverse biological events, accurately locating DNA-binding residues is necessary. This remains a challenging task in the post-genomic era, in which data on protein sequenc...

Bibliographic Details
Main Authors: Binh P. Nguyen, Quang H. Nguyen, Giang-Nam Doan-Ngoc, Thanh-Hoang Nguyen-Vo, Susanto Rahardja
Format: Article
Language: English
Published: BMC 2019-12-01
Series: BMC Bioinformatics
Subjects:
Online Access: https://doi.org/10.1186/s12859-019-3295-2
_version_ 1819056936680685568
author Binh P. Nguyen
Quang H. Nguyen
Giang-Nam Doan-Ngoc
Thanh-Hoang Nguyen-Vo
Susanto Rahardja
collection DOAJ
description Abstract Background Because protein-DNA interactions are essential to diverse biological events, accurately locating DNA-binding residues is necessary. This remains a challenging task in the post-genomic era, in which data on protein sequences have expanded very fast. In this study, we propose iProDNA-CapsNet, a new prediction model that identifies protein-DNA binding residues using an ensemble of capsule neural networks (CapsNets) on position-specific scoring matrix (PSSM) profiles. The use of CapsNets promises an innovative approach to determining the location of DNA-binding residues. The benchmark datasets introduced by Hu et al. (2017), i.e., PDNA-543 and PDNA-TEST, were used to train and evaluate the model, respectively. To fairly assess model performance, iProDNA-CapsNet was compared with existing state-of-the-art methods. Results Under the decision threshold corresponding to a false positive rate (FPR) ≈ 5%, the accuracy, sensitivity, precision, and Matthews correlation coefficient (MCC) of our model are increased by about 2.0%, 2.0%, 14.0%, and 5.0% with respect to TargetDNA (Hu et al., 2017) and 1.0%, 75.0%, 45.0%, and 77.0% with respect to BindN+ (Wang et al., 2010), respectively. Compared with other methods that do not report their threshold settings, iProDNA-CapsNet also shows a significant improvement in performance on most of the evaluation metrics. Even with different patterns of change among the models, iProDNA-CapsNet remains the best model, with top performance in most of the metrics, especially MCC, which is boosted by about 8.0% to 220.0%. Conclusions According to all evaluation metrics under various decision thresholds, iProDNA-CapsNet shows better performance compared to the two current best models (BindN and TargetDNA). Our proposed approach also shows that CapsNet can potentially be used and adopted in other biological applications.
first_indexed 2024-12-21T13:31:20Z
format Article
id doaj.art-b244f3143ccb458bad20a7f0815ba542
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-21T13:31:20Z
publishDate 2019-12-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-b244f3143ccb458bad20a7f0815ba542; 2022-12-21T19:02:18Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2019-12-01; 20; S23; 1; 12; 10.1186/s12859-019-3295-2; iProDNA-CapsNet: identifying protein-DNA binding residues using capsule neural networks; Binh P. Nguyen (0); Quang H. Nguyen (1); Giang-Nam Doan-Ngoc (2); Thanh-Hoang Nguyen-Vo (3); Susanto Rahardja (4); (0) School of Mathematics and Statistics, Victoria University of Wellington; (1) School of Information and Communication Technology, Hanoi University of Science and Technology; (2) School of Information and Communication Technology, Hanoi University of Science and Technology; (3) School of Mathematics and Statistics, Victoria University of Wellington; (4) School of Marine Science and Technology, Northwestern Polytechnical University; https://doi.org/10.1186/s12859-019-3295-2; Protein-DNA interaction; Residue; Prediction; PSSM; Capsule neural network; Deep learning
title iProDNA-CapsNet: identifying protein-DNA binding residues using capsule neural networks
topic Protein-DNA interaction
Residue
Prediction
PSSM
Capsule neural network
Deep learning
url https://doi.org/10.1186/s12859-019-3295-2