iEnhancer-DHF: Identification of Enhancers and Their Strengths Using Optimize Deep Neural Network With Multiple Features Extraction Methods
Enhancers are short DNA regulatory elements which play a vital role in gene expression. Due to their important roles in genomics, several computational models have been proposed in the literature for identification of enhancers and their strengths using traditional machine learning algorithms, howev...
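Main Authors: | Nagina Inayat, Mukhtaj Khan, Nadeem Iqbal, Salman Khan, Mushtaq Raza, Dost Muhammad Khan, Abbas Khan, Dong Qing Wei |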
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
Subjects: | DNAs; enhancers; Word2vec; PseKNC; deep learning; machine learning |
Online Access: | https://ieeexplore.ieee.org/document/9363878/ |
_version_ | 1818933893557911552 |
---|---|
author | Nagina Inayat; Mukhtaj Khan; Nadeem Iqbal; Salman Khan; Mushtaq Raza; Dost Muhammad Khan; Abbas Khan; Dong Qing Wei |
author_facet | Nagina Inayat; Mukhtaj Khan; Nadeem Iqbal; Salman Khan; Mushtaq Raza; Dost Muhammad Khan; Abbas Khan; Dong Qing Wei |
author_sort | Nagina Inayat |
collection | DOAJ |
description | Enhancers are short DNA regulatory elements that play a vital role in gene expression. Because of their importance in genomics, several computational models based on traditional machine learning algorithms have been proposed in the literature for identifying enhancers and their strengths. However, these models are unable to identify enhancers and their strengths with reasonable accuracy because of the high non-linearity in DNA sequences. This article proposes a two-level intelligent model based on a Deep Neural Network (DNN) combined with multiple feature extraction methods. First, the proposed model converts the given DNA sequences into feature vectors using the Pseudo K-tuple Nucleotide Composition (PseKNC) and FastText methods. Second, the feature vectors are fused into a heterogeneous feature vector that captures the local and global correlations among the given sequences along with internal structure information. Finally, the heterogeneous feature vector is passed to a DNN model to make the final predictions. The proposed iEnhancer-DHF is developed using a two-layer approach. The first layer predicts whether the given DNA samples are enhancers or non-enhancers, whereas the second layer identifies whether the enhancers are strong or weak. The performance of the proposed model was rigorously assessed on both training and independent datasets using 10-fold cross-validation. The validation results showed that the iEnhancer-DHF model yielded accuracies of 86.07% and 69.60% at the first and second layers, respectively, on the training dataset. Similarly, the model yielded accuracies of 83.21% and 67.54% at the first and second layers, respectively, on the independent dataset. Additionally, the proposed model was first compared with widely applied classifiers such as Support Vector Machine, Random Forest, and K-nearest Neighbor, and its performance was subsequently compared with existing models using both the training and independent datasets. The comparison results showed that the iEnhancer-DHF model outperformed recently published models. |
first_indexed | 2024-12-20T04:55:37Z |
format | Article |
id | doaj.art-ee6d091dcc0a4a55b4a2e3e212489568 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-20T04:55:37Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-ee6d091dcc0a4a55b4a2e3e212489568; 2022-12-21T19:52:44Z; eng; IEEE; IEEE Access; 2169-3536; 2021-01-01; volume 9; pages 40783-40796; DOI 10.1109/ACCESS.2021.3062291; 9363878; iEnhancer-DHF: Identification of Enhancers and Their Strengths Using Optimize Deep Neural Network With Multiple Features Extraction Methods; Authors: Nagina Inayat (0), Mukhtaj Khan (1, https://orcid.org/0000-0002-4933-6192), Nadeem Iqbal (2, https://orcid.org/0000-0003-1050-1792), Salman Khan (3, https://orcid.org/0000-0002-2905-1755), Mushtaq Raza (4, https://orcid.org/0000-0003-2890-8072), Dost Muhammad Khan (5, https://orcid.org/0000-0002-3919-8136), Abbas Khan (6), Dong Qing Wei (7, https://orcid.org/0000-0003-4200-7502); Affiliations, in author order: Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan (0 to 4); Department of Statistic, Abdul Wali Khan University Mardan, Mardan, Pakistan (5); Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China (6 and 7); https://ieeexplore.ieee.org/document/9363878/; Keywords: DNAs, enhancers, Word2vec, PseKNC, deep learning, machine learning |
title | iEnhancer-DHF: Identification of Enhancers and Their Strengths Using Optimize Deep Neural Network With Multiple Features Extraction Methods |
topic | DNAs; enhancers; Word2vec; PseKNC; deep learning; machine learning |
url | https://ieeexplore.ieee.org/document/9363878/ |
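
The pipeline summarized in the description (sequence features, feature fusion, then a two-layer decision) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `kmer_composition` is a plain normalized k-mer frequency vector, a simplified stand-in for PseKNC that omits its sequence-order correlation terms; the second feature vector (for example a FastText embedding) is assumed to be supplied as a list of floats; and `is_enhancer` and `is_strong` are hypothetical callables standing in for the two trained DNN layers.

```python
from itertools import product
from typing import Callable, List, Sequence

# Hypothetical classifier type: maps a feature vector to P(positive class).
Classifier = Callable[[Sequence[float]], float]


def kmer_composition(seq: str, k: int = 2) -> List[float]:
    """Normalized k-mer frequencies over ACGT (4**k values).

    Simplified stand-in for PseKNC: it captures local composition only.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {kmer: 0 for kmer in kmers}
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


def fuse(*feature_vectors: Sequence[float]) -> List[float]:
    """Fuse several feature vectors by concatenation (an assumed fusion rule)."""
    fused: List[float] = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


def two_layer_predict(
    features: Sequence[float],
    is_enhancer: Classifier,
    is_strong: Classifier,
    threshold: float = 0.5,
) -> str:
    """Layer 1 separates enhancers from non-enhancers; layer 2 grades strength."""
    if is_enhancer(features) < threshold:
        return "non-enhancer"
    if is_strong(features) >= threshold:
        return "strong enhancer"
    return "weak enhancer"
```

In use, a sequence would be converted with `kmer_composition`, concatenated with its embedding via `fuse`, and passed to `two_layer_predict`; the second classifier is consulted only when the first one accepts the sample as an enhancer, which mirrors the two-layer design described in the abstract.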