An Ensemble-Learning Based Application to Predict the Earlier Stages of Alzheimer’s Disease (AD)

Bibliographic Details
Main Authors: Asif Hassan Syed, Tabrej Khan, Atif Hassan, Nashwan A. Alromema, Muhammad Binsawad, Alhuseen Omar Alsayed
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9288747/
Description
Summary: Ensemble methods are known to enhance prediction performance. We therefore developed a weighted ensemble method that uses a novel combination of Cerebrospinal Fluid (CSF) protein biomarkers to predict the earlier stages of AD with greater accuracy than state-of-the-art CSF protein biomarkers. Two feature selection methods, Recursive Feature Elimination (RFE) and L1 regularization, were used to screen the most important subset of features for building a classification model on the Mild Cognitive Impairment (MCI) dataset. A novel combination of three biomarkers, namely Cystatin C, Matrix metalloproteinase 10 (MMP10), and tau protein, was selected using RFE based on linear Support Vector Machine (SVM) and Logistic Regression (LR) classifiers. A two-tailed unpaired t-test at a 5% significance level showed a significant difference in the mean levels of Cystatin C, MMP10, and tau protein between the cognitively normal and cognitively impaired groups. An ensemble model using a weighted average of the two best-performing classifiers (LR and linear SVM) was built on this novel subset of the three most informative features. The weighted-average ensemble performed significantly better than either the LR or the linear SVM base classifier alone. The area under the Receiver Operating Characteristic curve (ROC_AUC) and the area under the Precision-Recall curve (AUPR) of the proposed model were 0.9799 ± 0.055 and 0.9108 ± 0.015, respectively. The proposed weighted-average ensemble model built on the novel combination of CSF protein biomarkers performed significantly better (p < 0.001) than models built on different combinations of CSF protein biomarkers reported in recent studies. An ensemble-learning based application was implemented and deployed on Heroku at https://appsalzheimer.herokuapp.com.
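The pipeline described in the summary (RFE-based selection of three features, then a weighted average of LR and linear SVM probabilities) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the data are synthetic, the helper `weighted_average` is a hypothetical name, and the equal weights, the choice of LR as the RFE estimator, and the train/test split are all assumptions made only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def weighted_average(probas, weights):
    """Weighted mean of per-classifier positive-class probabilities.

    Hypothetical helper; weights are normalized to sum to 1 by np.average.
    """
    return np.average(np.vstack(probas), axis=0, weights=weights)


# Synthetic stand-in for a CSF biomarker table (assumption: 10 candidate
# features, 3 of them informative). No real patient data is used here.
X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Standardize using training statistics only.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Recursive Feature Elimination down to three features, driven here by LR.
selector = RFE(LogisticRegression(max_iter=1000),
               n_features_to_select=3).fit(X_train_s, y_train)
X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)

# Two base classifiers trained on the selected features.
lr = LogisticRegression(max_iter=1000).fit(X_train_sel, y_train)
svm = SVC(kernel="linear", probability=True,
          random_state=0).fit(X_train_sel, y_train)

# Weighted average of positive-class probabilities.
# Assumption: equal weights; the weights used in the paper are not given here.
weights = [0.5, 0.5]
p_ens = weighted_average(
    [lr.predict_proba(X_test_sel)[:, 1], svm.predict_proba(X_test_sel)[:, 1]],
    weights)

roc_auc = roc_auc_score(y_test, p_ens)
aupr = average_precision_score(y_test, p_ens)
print(f"ROC_AUC={roc_auc:.3f}  AUPR={aupr:.3f}")
```

The printed scores describe only the synthetic data and are unrelated to the values reported in the summary. In practice the ensemble weights would typically be tuned on validation folds rather than fixed in advance.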
ISSN:2169-3536