Identifying Effective Feature Selection Methods for Alzheimer’s Disease Biomarker Gene Detection Using Machine Learning
Alzheimer’s disease (AD) is a complex genetic disorder that affects the brain and has been the focus of many bioinformatics research studies. The primary objective of these studies is to identify and classify genes involved in the progression of AD and to explore the function of these risk genes in...
Main Authors: | Hala Alshamlan, Samar Omar, Rehab Aljurayyad, Reham Alabduljabbar |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-05-01 |
Series: | Diagnostics |
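Subjects: | data mining, genetic disease prediction, Alzheimer disease, gene expression, feature selection, classification |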
Online Access: | https://www.mdpi.com/2075-4418/13/10/1771 |
_version_ | 1797600396029460480 |
---|---|
author | Hala Alshamlan, Samar Omar, Rehab Aljurayyad, Reham Alabduljabbar |
author_facet | Hala Alshamlan, Samar Omar, Rehab Aljurayyad, Reham Alabduljabbar |
author_sort | Hala Alshamlan |
collection | DOAJ |
description | Alzheimer’s disease (AD) is a complex genetic disorder that affects the brain and has been the focus of many bioinformatics studies. The primary objective of these studies is to identify and classify genes involved in the progression of AD and to explore the function of these risk genes in the disease process. This research aims to identify the most effective model for detecting biomarker genes associated with AD by comparing several feature selection methods. We compared the efficiency of five feature selection methods, each paired with a support vector machine (SVM) classifier: minimum redundancy maximum relevance (mRMR), correlation-based feature selection (CFS), the Chi-Square Test, F-score, and a genetic algorithm (GA). We calculated the accuracy of the SVM classifier using validation methods such as 10-fold cross-validation. We applied these feature selection methods with SVM to a benchmark AD gene expression dataset consisting of 696 samples and 200 genes. The results indicate that mRMR and F-score, each combined with the SVM classifier, achieved an accuracy of around 84% using between 20 and 40 genes, and that both outperformed the GA, Chi-Square Test, and CFS methods. Overall, these findings suggest that mRMR and F-score feature selection with an SVM classifier are effective in identifying biomarker genes related to AD and could potentially lead to more accurate diagnosis and treatment of the disease. |
first_indexed | 2024-03-11T03:47:32Z |
format | Article |
id | doaj.art-468286de955d47328dd99a2d9518afa0 |
institution | Directory Open Access Journal |
issn | 2075-4418 |
language | English |
last_indexed | 2024-03-11T03:47:32Z |
publishDate | 2023-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Diagnostics |
spelling | doaj.art-468286de955d47328dd99a2d9518afa0; 2023-11-18T01:04:57Z; eng; MDPI AG; Diagnostics; ISSN 2075-4418; 2023-05-01; vol. 13, no. 10, art. 1771; DOI 10.3390/diagnostics13101771; Identifying Effective Feature Selection Methods for Alzheimer’s Disease Biomarker Gene Detection Using Machine Learning; Hala Alshamlan, Samar Omar, Rehab Aljurayyad, Reham Alabduljabbar (all: Department of Information Technology, College of Computer and Information Sciences, King Saud University, P.O. Box 145111, Riyadh 4545, Saudi Arabia); https://www.mdpi.com/2075-4418/13/10/1771; data mining, genetic disease prediction, Alzheimer disease, gene expression, feature selection, classification |
title | Identifying Effective Feature Selection Methods for Alzheimer’s Disease Biomarker Gene Detection Using Machine Learning |
topic | data mining, genetic disease prediction, Alzheimer disease, gene expression, feature selection, classification |
url | https://www.mdpi.com/2075-4418/13/10/1771 |
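
The description above names F-score as one of the filter methods used to rank genes before classification. The record does not say which definition the authors used, so the following is only a minimal sketch that assumes the common two-class F-score: the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function names, the toy matrix, and the 1/0 label coding are hypothetical and do not come from the article.

```python
from statistics import mean, variance


def f_score(values, labels):
    """Two-class F-score of one feature (assumed definition, see lead-in).

    labels use 1 for cases and 0 for controls (a hypothetical coding).
    Each class needs at least two samples so a sample variance exists.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    # Numerator: how far each class mean sits from the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Denominator: within-class spread, using sample variances (n - 1).
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else 0.0


def top_k_features(X, y, k):
    """Return the column indices of the k highest-scoring features."""
    n_features = len(X[0])
    scores = [f_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy expression matrix: 6 samples x 3 hypothetical genes.
# Only the first gene differs clearly between the two classes.
X = [
    [5.1, 0.9, 2.0],
    [4.9, 1.1, 2.1],
    [5.0, 1.0, 1.9],
    [1.0, 1.0, 2.0],
    [1.2, 0.8, 2.2],
    [0.9, 1.2, 1.8],
]
y = [1, 1, 1, 0, 0, 0]

selected = top_k_features(X, y, k=1)
print(selected)
```

In a pipeline like the one the description outlines, the selected columns would then be passed to a classifier such as an SVM and scored with cross-validation. Ranking the genes inside each training fold, rather than once on the full dataset, avoids leaking test-fold information into the selection step.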