Identifying Effective Feature Selection Methods for Alzheimer’s Disease Biomarker Gene Detection Using Machine Learning

Bibliographic Details
Main Authors: Hala Alshamlan, Samar Omar, Rehab Aljurayyad, Reham Alabduljabbar
Format: Article
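Language: English
Published: MDPI AG 2023-05-01
Series: Diagnostics
Subjects: data mining; genetic disease prediction; Alzheimer disease; gene expression; feature selection; classification
Online Access: https://www.mdpi.com/2075-4418/13/10/1771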
collection DOAJ
description Alzheimer’s disease (AD) is a complex genetic disorder that affects the brain and has been the focus of many bioinformatics studies. The primary objective of these studies is to identify and classify genes involved in the progression of AD and to explore the function of these risk genes in the disease process. This research aims to identify the most effective model for detecting biomarker genes associated with AD by comparing several feature selection methods. We compared five feature selection methods, each paired with a support vector machine (SVM) classifier: minimum redundancy maximum relevance (mRMR), correlation-based feature selection (CFS), the Chi-Square Test, F-score, and a genetic algorithm (GA). We estimated the accuracy of the SVM classifier using validation methods such as 10-fold cross-validation, applying each method to a benchmark AD gene expression dataset consisting of 696 samples and 200 genes. The mRMR and F-score methods with the SVM classifier achieved an accuracy of around 84% using between 20 and 40 genes, and outperformed the GA, Chi-Square Test, and CFS methods. Overall, these findings suggest that mRMR and F-score feature selection with an SVM classifier are effective in identifying biomarker genes related to AD and could potentially lead to more accurate diagnosis and treatment of the disease.
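As a rough illustration of the F-score ranking step named in the description, the sketch below scores each feature of a tiny binary-labelled table and keeps the top k. It assumes one commonly used binary F-score definition (squared distance of each class mean from the overall mean, divided by the sum of the per-class sample variances); the record does not state which definition the authors used, so this may differ from their method. The function names `f_score` and `top_k_features` and the toy data are hypothetical and are not taken from the article or its dataset.

```python
from statistics import mean, variance


def f_score(values, labels):
    """F-score of a single feature for binary labels (1 = positive, 0 = negative).

    Assumed definition: between-class separation divided by the sum of the
    per-class sample variances. Not confirmed to be the article's formula.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances, n - 1 denominator
    return between / within if within > 0 else float("inf")


def top_k_features(rows, labels, k):
    """Return the indices of the k highest-scoring columns and all scores."""
    n_features = len(rows[0])
    scores = [f_score([row[j] for row in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: 6 samples, 3 "genes"; only column 0 separates the classes.
rows = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.1],
    [3.0, 5.5, 0.8],
    [3.1, 4.5, 0.3],
    [2.9, 5.0, 0.7],
]
labels = [0, 0, 0, 1, 1, 1]

selected, scores = top_k_features(rows, labels, k=1)
print("selected feature indices:", selected)
```

In a workflow like the one described, the selected columns would then be passed to a classifier such as an SVM and evaluated with cross-validation; that step is omitted here to keep the sketch dependency-free.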
first_indexed 2024-03-11T03:47:32Z
id doaj.art-468286de955d47328dd99a2d9518afa0
institution Directory Open Access Journal
issn 2075-4418
last_indexed 2024-03-11T03:47:32Z
spelling doaj.art-468286de955d47328dd99a2d9518afa0; 2023-11-18T01:04:57Z; eng; MDPI AG; Diagnostics; ISSN 2075-4418; 2023-05-01; volume 13, issue 10, article 1771; DOI 10.3390/diagnostics13101771
Author affiliation (all four authors): Department of Information Technology, College of Computer and Information Sciences, King Saud University, P.O. Box 145111, Riyadh 4545, Saudi Arabia
topic data mining
genetic disease prediction
Alzheimer disease
gene expression
feature selection
classification