iEnhancer-DHF: Identification of Enhancers and Their Strengths Using Optimize Deep Neural Network With Multiple Features Extraction Methods


Bibliographic Details
Main Authors: Nagina Inayat, Mukhtaj Khan, Nadeem Iqbal, Salman Khan, Mushtaq Raza, Dost Muhammad Khan, Abbas Khan, Dong Qing Wei
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: DNAs; enhancers; Word2vec; PseKNC; deep learning; machine learning
Online Access: https://ieeexplore.ieee.org/document/9363878/
author Nagina Inayat
Mukhtaj Khan
Nadeem Iqbal
Salman Khan
Mushtaq Raza
Dost Muhammad Khan
Abbas Khan
Dong Qing Wei
collection DOAJ
description Enhancers are short DNA regulatory elements that play a vital role in gene expression. Because of their importance in genomics, several computational models based on traditional machine learning algorithms have been proposed for identifying enhancers and their strengths; however, these models do not identify enhancers and their strengths with reasonable accuracy because of the high non-linearity in DNA sequences. This article proposes a two-level intelligent model based on a Deep Neural Network (DNN) together with multiple feature extraction methods. First, the proposed model converts the given DNA sequences into feature vectors using Pseudo K-tuple Nucleotide Composition (PseKNC) and FastText. Second, the feature vectors are fused into a heterogeneous feature vector that captures the local and global correlations among the given sequences along with internal structure information. Finally, the heterogeneous feature vector is passed to a DNN model to make the final predictions. The proposed iEnhancer-DHF uses a two-layer approach: the first layer predicts whether the given DNA samples are enhancers or non-enhancers, and the second layer classifies the enhancers as strong or weak. The model was assessed on both training and independent datasets using 10-fold cross-validation. On the training dataset, iEnhancer-DHF achieved accuracies of 86.07% and 69.60% at the first and second layers, respectively; on the independent dataset, it achieved 83.21% and 67.54%. The model was first compared with widely used classifiers such as Support Vector Machine, Random Forest, and K-nearest Neighbor, and then with existing models, using both the training and independent datasets. The comparison showed that iEnhancer-DHF outperformed recently published models.
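The pipeline outlined in the description (sequence features, feature fusion, then a two-layer decision) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: `kmer_composition` computes plain k-mer frequencies, which is only the composition part of PseKNC (the sequence-order correlation terms are omitted); the FastText embedding is represented by a hand-written placeholder vector; and the two trained DNNs are replaced by arbitrary callables. All function names are hypothetical.

```python
from itertools import product
from typing import Callable, List, Sequence


def kmer_composition(seq: str, k: int = 2) -> List[float]:
    """Normalized k-mer frequencies over ACGT (4**k values).

    Simplification: real PseKNC also appends sequence-order
    correlation factors, which are not computed here.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {kmer: 0 for kmer in kmers}
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


def fuse(*vectors: Sequence[float]) -> List[float]:
    """Concatenate several feature vectors into one heterogeneous vector."""
    fused: List[float] = []
    for vector in vectors:
        fused.extend(vector)
    return fused


# Stand-in type for a trained model: takes features, returns a yes/no decision.
Classifier = Callable[[Sequence[float]], bool]


def two_layer_predict(features: Sequence[float],
                      is_enhancer: Classifier,
                      is_strong: Classifier) -> str:
    """Layer 1 separates enhancers from non-enhancers;
    layer 2 is consulted only for sequences layer 1 accepts."""
    if not is_enhancer(features):
        return "non-enhancer"
    return "strong enhancer" if is_strong(features) else "weak enhancer"


if __name__ == "__main__":
    composition = kmer_composition("ACGTACGTAC", k=2)
    embedding = [0.12, -0.40, 0.33]  # placeholder for a FastText vector
    features = fuse(composition, embedding)
    # Toy decision rules standing in for the two trained networks.
    label = two_layer_predict(features,
                              is_enhancer=lambda f: f[1] > 0.2,
                              is_strong=lambda f: f[-1] > 0.5)
    print(len(features), label)
```

Passing the classifiers in as callables keeps the cascade logic separate from whatever model is actually trained for each layer, so the same function works with a neural network, an SVM, or the toy rules shown here.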
first_indexed 2024-12-20T04:55:37Z
format Article
id doaj.art-ee6d091dcc0a4a55b4a2e3e212489568
institution Directory of Open Access Journals
issn 2169-3536
language English
last_indexed 2024-12-20T04:55:37Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-ee6d091dcc0a4a55b4a2e3e212489568
2022-12-21T19:52:44Z
eng
IEEE
IEEE Access
2169-3536
2021-01-01
volume 9, pages 40783-40796
doi 10.1109/ACCESS.2021.3062291
9363878
iEnhancer-DHF: Identification of Enhancers and Their Strengths Using Optimize Deep Neural Network With Multiple Features Extraction Methods
Nagina Inayat (0), Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
Mukhtaj Khan (1), https://orcid.org/0000-0002-4933-6192, Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
Nadeem Iqbal (2), https://orcid.org/0000-0003-1050-1792, Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
Salman Khan (3), https://orcid.org/0000-0002-2905-1755, Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
Mushtaq Raza (4), https://orcid.org/0000-0003-2890-8072, Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan
Dost Muhammad Khan (5), https://orcid.org/0000-0002-3919-8136, Department of Statistic, Abdul Wali Khan University Mardan, Mardan, Pakistan
Abbas Khan (6), Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
Dong Qing Wei (7), https://orcid.org/0000-0003-4200-7502, Department of Bioinformatics and Biological Statistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
https://ieeexplore.ieee.org/document/9363878/
DNAs
enhancers
Word2vec
PseKNC
deep learning
machine learning
title iEnhancer-DHF: Identification of Enhancers and Their Strengths Using Optimize Deep Neural Network With Multiple Features Extraction Methods
topic DNAs
enhancers
Word2vec
PseKNC
deep learning
machine learning
url https://ieeexplore.ieee.org/document/9363878/