Enhancing Feature Selection for Imbalanced Alzheimer’s Disease Brain MRI Images by Random Forest
Imbalanced learning problems arise often in practice and are an important research direction in machine learning. Traditional classifiers are substantially less effective on datasets with an imbalanced class distribution, especially for high-dimensional longitudinal...
Main Authors: | Xibin Wang, Qiong Zhou, Hui Li, Mei Chen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-06-01 |
Series: | Applied Sciences |
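Subjects: | magnetic resonance imaging (MRI), random forest, feature extraction, Alzheimer’s disease, imbalanced learning |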
Online Access: | https://www.mdpi.com/2076-3417/13/12/7253 |
_version_ | 1797596218095828992 |
---|---|
author | Xibin Wang Qiong Zhou Hui Li Mei Chen |
author_sort | Xibin Wang |
collection | DOAJ |
description | Imbalanced learning problems arise often in practice and are an important research direction in machine learning. Traditional classifiers are substantially less effective on datasets with an imbalanced class distribution, especially for high-dimensional longitudinal data. In the medical field, data imbalance is particularly common, and correctly identifying minority-class samples can yield important information. Moreover, class imbalance in AD (Alzheimer’s disease) data presents a significant challenge for machine learning algorithms that assume the data are evenly distributed across classes. In this paper, we propose a random forest-based feature selection algorithm for imbalanced neuroimaging data classification. The algorithm uses random forest to evaluate the importance of each feature and combines this with the correlation matrix to choose the optimal feature subset; it is applied to imbalanced MRI (magnetic resonance imaging) AD data to distinguish AD, MCI (mild cognitive impairment), and NC (normal individuals). In addition, we extract multiple features from AD images that represent 2D and 3D brain information. The effectiveness of the proposed method is verified experimentally on the public ADNI (Alzheimer’s Disease Neuroimaging Initiative) dataset. The results demonstrate that the proposed method achieves higher prediction accuracy and AUC (area under the receiver operating characteristic curve) on the NC-AD, MCI-AD, and NC-MCI group data, with the highest accuracy and AUC on the NC-AD group. |
first_indexed | 2024-03-11T02:47:28Z |
format | Article |
id | doaj.art-248047ee067a4bc599aec03429e93dcb |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-11T02:47:28Z |
publishDate | 2023-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
title | Enhancing Feature Selection for Imbalanced Alzheimer’s Disease Brain MRI Images by Random Forest |
topic | magnetic resonance imaging (MRI) random forest feature extraction Alzheimer’s disease imbalanced learning |
url | https://www.mdpi.com/2076-3417/13/12/7253 |
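
The description above says the algorithm scores each feature with a random forest and then uses the correlation matrix to choose a feature subset. This record does not give the exact combination rule, so the following is only a minimal sketch of one plausible reading: visit features from most to least important and keep a feature only if it is not strongly correlated with one already kept. The function names, the `max_abs_corr` threshold, and the toy importance scores are all assumptions for illustration; in practice the scores might come from something like scikit-learn's `RandomForestClassifier.feature_importances_`.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / math.sqrt(vx * vy)


def select_features(columns, importances, max_abs_corr=0.9):
    """Greedy selection (assumed rule, not taken from the paper).

    Visit features from most to least important and keep one only if its
    absolute correlation with every already-kept feature is at most
    max_abs_corr. Returns the kept feature indices in visiting order.
    """
    order = sorted(range(len(columns)), key=lambda j: importances[j], reverse=True)
    kept = []
    for j in order:
        if all(abs(pearson(columns[j], columns[k])) <= max_abs_corr for k in kept):
            kept.append(j)
    return kept


# Toy data: feature 1 is an exact multiple of feature 0; feature 2 is unrelated.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 1.0, 4.0, 2.0, 3.0],
]
importances = [0.5, 0.3, 0.2]  # hypothetical random-forest importance scores

selected = select_features(columns, importances)
print(selected)  # feature 1 is dropped as redundant with feature 0
```

Ranking by importance first means that, within a group of redundant features, the one the forest found most useful is the one retained. The threshold trades subset size against redundancy and would need tuning on real data.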