Enhancing Feature Selection for Imbalanced Alzheimer’s Disease Brain MRI Images by Random Forest

Imbalanced learning problems arise frequently in practical applications and are an important research direction in machine learning. Traditional classifiers are substantially less effective on datasets with an imbalanced class distribution, especially for high-dimensional longitudinal...

Bibliographic Details
Main Authors: Xibin Wang, Qiong Zhou, Hui Li, Mei Chen
Format: Article
Language: English
Published: MDPI AG 2023-06-01
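Series: Applied Sciences
Subjects: magnetic resonance imaging (MRI); random forest; feature extraction; Alzheimer’s disease; imbalanced learning
Online Access: https://www.mdpi.com/2076-3417/13/12/7253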
author Xibin Wang
Qiong Zhou
Hui Li
Mei Chen
collection DOAJ
description Imbalanced learning problems arise frequently in practical applications and are an important research direction in machine learning. Traditional classifiers are substantially less effective on datasets with an imbalanced class distribution, especially for high-dimensional longitudinal data. In the medical field, data imbalance is more common, and correctly identifying minority-class samples can yield important information. Moreover, class imbalance in AD (Alzheimer’s disease) data presents a significant challenge for machine learning algorithms that assume the data are evenly distributed across the classes. In this paper, we propose a random forest-based feature selection algorithm for imbalanced neuroimaging data classification. The algorithm uses a random forest to evaluate the importance of each feature and combines these scores with the feature correlation matrix to choose the optimal feature subset; it is applied to imbalanced MRI (magnetic resonance imaging) AD data to distinguish AD, MCI (mild cognitive impairment), and NC (normal controls). In addition, we extract multiple features from AD images that represent 2D and 3D brain information. The effectiveness of the proposed method is verified experimentally on the public ADNI (Alzheimer’s Disease Neuroimaging Initiative) dataset. The results show that the proposed method achieves higher prediction accuracy and AUC (area under the receiver operating characteristic curve) on the NC-AD, MCI-AD, and NC-MCI group data, with the highest accuracy and AUC on the NC-AD group.
first_indexed 2024-03-11T02:47:28Z
format Article
id doaj.art-248047ee067a4bc599aec03429e93dcb
institution Directory of Open Access Journals
issn 2076-3417
language English
last_indexed 2024-03-11T02:47:28Z
publishDate 2023-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-248047ee067a4bc599aec03429e93dcb
2023-11-18T09:11:17Z
eng
MDPI AG
Applied Sciences
2076-3417
2023-06-01
13
12
7253
10.3390/app13127253
Enhancing Feature Selection for Imbalanced Alzheimer’s Disease Brain MRI Images by Random Forest
Xibin Wang, School of Data Science, Guizhou Institute of Technology, Guiyang 550003, China
Qiong Zhou, College of Computer Science & Technology, Guizhou University, Guiyang 550025, China
Hui Li, College of Computer Science & Technology, Guizhou University, Guiyang 550025, China
Mei Chen, College of Computer Science & Technology, Guizhou University, Guiyang 550025, China
https://www.mdpi.com/2076-3417/13/12/7253
magnetic resonance imaging (MRI)
random forest
feature extraction
Alzheimer’s disease
imbalanced learning
title Enhancing Feature Selection for Imbalanced Alzheimer’s Disease Brain MRI Images by Random Forest
topic magnetic resonance imaging (MRI)
random forest
feature extraction
Alzheimer’s disease
imbalanced learning
url https://www.mdpi.com/2076-3417/13/12/7253
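
The description field above says the method scores each feature with a random forest and combines those scores with a correlation matrix to pick a feature subset. The record does not give the exact selection rule, so the following is only a minimal sketch of one plausible reading: rank features by importance and greedily skip any feature that is strongly correlated with one already kept. The function names, the 0.9 threshold, the toy feature names, and the importance values are all hypothetical. In practice the scores might come from something like scikit-learn's `feature_importances_`, which is an assumption and not something the record states.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, importances, max_abs_corr=0.9):
    """Greedy, importance-ordered selection with a correlation filter.

    columns:      dict mapping feature name -> list of values (one per sample)
    importances:  dict mapping feature name -> importance score
                  (assumed here to come from a trained random forest)
    max_abs_corr: hypothetical redundancy threshold, not taken from the paper
    """
    ranked = sorted(importances, key=importances.get, reverse=True)
    selected = []
    for name in ranked:
        # Keep a feature only if it is not redundant with any kept feature.
        if all(abs(pearson(columns[name], columns[kept])) <= max_abs_corr
               for kept in selected):
            selected.append(name)
    return selected


# Toy data with made-up feature names and made-up importance scores.
columns = {
    "hippocampus_volume": [3.1, 2.9, 2.4, 2.2, 3.0, 2.1],
    # Same measurement in different units, so it is perfectly correlated.
    "hippocampus_volume_mm3": [3100, 2900, 2400, 2200, 3000, 2100],
    "texture_contrast": [0.2, 0.7, 0.3, 0.9, 0.1, 0.5],
}
importances = {
    "hippocampus_volume": 0.5,
    "hippocampus_volume_mm3": 0.3,
    "texture_contrast": 0.2,
}

selected = select_features(columns, importances)
print(selected)
```

On this toy input the rescaled duplicate is dropped because it is redundant with the higher-ranked volume feature, while the weakly correlated texture feature is kept. A real pipeline would also need to address the class imbalance itself, which this sketch does not attempt.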