Prediction of Parkinson’s Disease Using Machine Learning Methods

The detection of Parkinson’s disease (PD) in its early stages is of great importance for its treatment and management, but consensus is lacking on what information is necessary and what models should be used to best predict PD risk. In our study, we first grouped PD-associated factors based on their...

Bibliographic Details
Main Authors: Jiayu Zhang, Wenchao Zhou, Hongmei Yu, Tong Wang, Xiaqiong Wang, Long Liu, Yalu Wen
Format: Article
Language: English
Published: MDPI AG 2023-12-01
Series: Biomolecules
Subjects: machine learning; Parkinson’s disease; polygenic risk scores; risk prediction model; SHAP value
Online Access: https://www.mdpi.com/2218-273X/13/12/1761
collection DOAJ
description The detection of Parkinson’s disease (PD) in its early stages is of great importance for its treatment and management, but consensus is lacking on what information is necessary and what models should be used to best predict PD risk. In our study, we first grouped PD-associated factors based on their cost and accessibility, and then gradually incorporated them into risk predictions, which were built using eight commonly used machine learning models to allow for comprehensive assessment. Finally, the Shapley Additive Explanations (SHAP) method was used to investigate the contributions of each factor. We found that models built with demographic variables, hospital admission examinations, clinical assessment, and polygenic risk score achieved the best prediction performance, and the inclusion of invasive biomarkers could not further enhance its accuracy. Among the eight machine learning models considered, penalized logistic regression and XGBoost were the most accurate algorithms for assessing PD risk, with penalized logistic regression achieving an area under the curve of 0.94 and a Brier score of 0.08. Olfactory function and polygenic risk scores were the most important predictors for PD risk. Our research has offered a practical framework for PD risk assessment, where necessary information and efficient machine learning tools were highlighted.
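The abstract reports two performance metrics, the area under the ROC curve (AUC) and the Brier score. As a minimal sketch of what these quantities measure, the following Python computes both from scratch. The labels and predicted risks are made-up toy values, and the function names are hypothetical; none of this is the study's data or code.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random case is scored above a random control (ties count as 1/2)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy example only: 1 = PD case, 0 = control; risks are invented for illustration.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

print(f"AUC   = {auc(y_true, y_prob):.3f}")
print(f"Brier = {brier_score(y_true, y_prob):.3f}")
```

AUC captures discrimination (how well cases are ranked above controls), while the Brier score also reflects calibration, which is why the two are typically reported together.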
first_indexed 2024-03-08T20:57:38Z
id doaj.art-55d496694a204e2d8460181b7b1dc776
institution Directory Open Access Journal
issn 2218-273X
last_indexed 2024-03-08T20:57:38Z
spelling doaj.art-55d496694a204e2d8460181b7b1dc776 | 2023-12-22T13:56:02Z | eng | MDPI AG | Biomolecules | 2218-273X | 2023-12-01 | vol. 13, no. 12, art. 1761 | doi:10.3390/biom13121761
Jiayu Zhang, Wenchao Zhou, Hongmei Yu, Tong Wang, Long Liu: Department of Health Statistics, School of Public Health, Shanxi Medical University, No. 56 Xinjian South Road, Yingze District, Taiyuan 030001, China
Xiaqiong Wang: Department of Epidemiology and Biostatistics, Southeast University, 87 Ding Jiaqiao Road, Nanjing 210009, China
Yalu Wen: Department of Statistics, University of Auckland, 38 Princes Street, Auckland Central, Auckland 1010, New Zealand
topic machine learning
Parkinson’s disease
polygenic risk scores
risk prediction model
SHAP value