An Efficient Lightweight Hybrid Model with Attention Mechanism for Enhancer Sequence Recognition

Bibliographic Details
Main Authors: Suliman Aladhadh, Saleh A. Almatroodi, Shabana Habib, Abdulatif Alabdulatif, Saeed Ullah Khattak, Muhammad Islam
Format: Article
Language:English
Published: MDPI AG 2022-12-01
Series:Biomolecules
Subjects: deep learning; enhancer sequence; convolution neural network; sequential learning models; temporal attention mechanism
Online Access:https://www.mdpi.com/2218-273X/13/1/70
author Suliman Aladhadh
Saleh A. Almatroodi
Shabana Habib
Abdulatif Alabdulatif
Saeed Ullah Khattak
Muhammad Islam
collection DOAJ
description Enhancers are sequences containing short motifs that exhibit high positional variability and are freely scattered across the genome. Identification of these noncoding DNA fragments and of their strength is extremely important because they play a key role in controlling gene regulation at the cellular level. Enhancers are more difficult to identify than other elements in the genome because they are freely scattered and their locations vary widely. In recent years, bioinformatics tools have enabled significant progress on this biological problem; however, existing computational methods based solely on DNA sequences cannot perform cell line-specific screening. The chromatin accessibility of a DNA segment can provide useful information about its potential regulatory function, allowing regulatory elements to be identified from accessibility data. In chromatin, the entangled structure allows positions far apart in the sequence to come into contact with each other, regardless of their proximity to the gene being acted upon. Thus, identifying enhancers and assessing their strength is difficult and time-consuming. The goal of our work was to overcome these limitations by presenting a deep learning model that combines a convolutional neural network (CNN) with attention-gated recurrent units (AttGRU). The model takes one-hot-encoded sequences as input to the CNN, primarily to identify enhancers and secondarily to classify their strength. To test its performance, the proposed enhancer-CNNAttGRU model was compared with existing state-of-the-art methods. The proposed model performed best for predicting stage one and stage two enhancer sequences, as well as their strengths, in a cross-species analysis, achieving best accuracy values of 87.39% and 84.46%, respectively. Overall, the results showed that the proposed model provided results comparable to state-of-the-art models, highlighting its usefulness.
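The description above outlines a two-stage pipeline: DNA sequences are one-hot encoded, passed through convolutional layers, and then through a gated recurrent unit with a temporal attention mechanism before classification. As a rough illustration only, here is a minimal Keras sketch of such a CNN-AttGRU architecture; the layer widths, sequence length, and attention formulation below are assumptions and are not taken from the paper itself.

```python
# Minimal sketch of a CNN + attention-GRU enhancer classifier.
# NOTE: the sequence length, layer sizes, and the additive attention
# used here are assumptions for illustration; the paper's exact
# architecture and training details are not given in this record.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

SEQ_LEN = 200          # assumed enhancer fragment length
ALPHABET = "ACGT"      # one-hot channels

def one_hot_encode(seq: str) -> np.ndarray:
    """Map a DNA string to a (SEQ_LEN, 4) one-hot matrix."""
    mat = np.zeros((SEQ_LEN, len(ALPHABET)), dtype=np.float32)
    for i, base in enumerate(seq[:SEQ_LEN]):
        j = ALPHABET.find(base.upper())
        if j >= 0:
            mat[i, j] = 1.0
    return mat

def build_cnn_attgru(num_classes: int = 1) -> Model:
    inputs = layers.Input(shape=(SEQ_LEN, 4))
    # Convolutional feature extractor over the one-hot sequence.
    x = layers.Conv1D(64, kernel_size=8, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling1D(pool_size=2)(x)
    x = layers.Dropout(0.3)(x)
    # Recurrent layer that keeps per-position outputs for attention.
    h = layers.GRU(64, return_sequences=True)(x)
    # Simple additive (temporal) attention over the GRU outputs.
    scores = layers.Dense(1, activation="tanh")(h)    # (batch, steps, 1)
    weights = layers.Softmax(axis=1)(scores)          # attention weights over time
    context = layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1)  # weighted sum -> (batch, 64)
    )([h, weights])
    outputs = layers.Dense(num_classes, activation="sigmoid")(context)
    return Model(inputs, outputs)

# Stage 1: enhancer vs. non-enhancer; stage 2 (strong vs. weak enhancer)
# would reuse the same architecture on the enhancer-only subset.
model = build_cnn_attgru()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```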
format Article
id doaj.art-463bc3956f96451ba7c227d8e0922151
institution Directory Open Access Journal
issn 2218-273X
language English
publishDate 2022-12-01
publisher MDPI AG
series Biomolecules
doi 10.3390/biom13010070 (Biomolecules, vol. 13, issue 1, article 70, published 2022-12-01 by MDPI AG)
affiliations Suliman Aladhadh: Department of Information Technology, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia
Saleh A. Almatroodi: Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia
Shabana Habib: Department of Information Technology, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia
Abdulatif Alabdulatif: Department of Computer Science, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia
Saeed Ullah Khattak: Centre of Biotechnology and Microbiology, University of Peshawar, Peshawar 25120, Pakistan
Muhammad Islam: Department of Electrical Engineering, College of Engineering and Information Technology, Onaizah Colleges, Onaizah 56447, Saudi Arabia
title An Efficient Lightweight Hybrid Model with Attention Mechanism for Enhancer Sequence Recognition
topic deep learning
enhancer sequence
convolution neural network
sequential learning models
temporal attention mechanism
url https://www.mdpi.com/2218-273X/13/1/70