Prediction of Parkinson’s Disease Using Machine Learning Methods
The detection of Parkinson’s disease (PD) in its early stages is of great importance for its treatment and management, but consensus is lacking on what information is necessary and what models should be used to best predict PD risk. In our study, we first grouped PD-associated factors based on their...
Main Authors: | Jiayu Zhang, Wenchao Zhou, Hongmei Yu, Tong Wang, Xiaqiong Wang, Long Liu, Yalu Wen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-12-01 |
Series: | Biomolecules |
Subjects: | machine learning; Parkinson’s disease; polygenic risk scores; risk prediction model; SHAP value |
Online Access: | https://www.mdpi.com/2218-273X/13/12/1761 |
_version_ | 1797381802007986176 |
---|---|
author | Jiayu Zhang Wenchao Zhou Hongmei Yu Tong Wang Xiaqiong Wang Long Liu Yalu Wen |
author_sort | Jiayu Zhang |
collection | DOAJ |
description | The detection of Parkinson’s disease (PD) in its early stages is of great importance for its treatment and management, but consensus is lacking on what information is necessary and what models should be used to best predict PD risk. In our study, we first grouped PD-associated factors based on their cost and accessibility, and then gradually incorporated them into risk predictions, which were built using eight commonly used machine learning models to allow for comprehensive assessment. Finally, the Shapley Additive Explanations (SHAP) method was used to investigate the contributions of each factor. We found that models built with demographic variables, hospital admission examinations, clinical assessment, and polygenic risk score achieved the best prediction performance, and the inclusion of invasive biomarkers could not further enhance its accuracy. Among the eight machine learning models considered, penalized logistic regression and XGBoost were the most accurate algorithms for assessing PD risk, with penalized logistic regression achieving an area under the curve of 0.94 and a Brier score of 0.08. Olfactory function and polygenic risk scores were the most important predictors for PD risk. Our research has offered a practical framework for PD risk assessment, where necessary information and efficient machine learning tools were highlighted. |
first_indexed | 2024-03-08T20:57:38Z |
format | Article |
id | doaj.art-55d496694a204e2d8460181b7b1dc776 |
institution | Directory Open Access Journal |
issn | 2218-273X |
language | English |
last_indexed | 2024-03-08T20:57:38Z |
publishDate | 2023-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Biomolecules |
spelling | doaj.art-55d496694a204e2d8460181b7b1dc776; 2023-12-22T13:56:02Z; eng; MDPI AG; Biomolecules; 2218-273X; 2023-12-01; 13; 12; 1761; 10.3390/biom13121761; Prediction of Parkinson’s Disease Using Machine Learning Methods; Jiayu Zhang (0); Wenchao Zhou (1); Hongmei Yu (2); Tong Wang (3); Xiaqiong Wang (4); Long Liu (5); Yalu Wen (6); Department of Health Statistics, School of Public Health, Shanxi Medical University, No. 56 Xinjian South Road, Yingze District, Taiyuan 030001, China; Department of Health Statistics, School of Public Health, Shanxi Medical University, No. 56 Xinjian South Road, Yingze District, Taiyuan 030001, China; Department of Health Statistics, School of Public Health, Shanxi Medical University, No. 56 Xinjian South Road, Yingze District, Taiyuan 030001, China; Department of Health Statistics, School of Public Health, Shanxi Medical University, No. 56 Xinjian South Road, Yingze District, Taiyuan 030001, China; Department of Epidemiology and Biostatistics, Southeast University, 87 Ding Jiaqiao Road, Nanjing 210009, China; Department of Health Statistics, School of Public Health, Shanxi Medical University, No. 56 Xinjian South Road, Yingze District, Taiyuan 030001, China; Department of Statistics, University of Auckland, 38 Princes Street, Auckland Central, Auckland 1010, New Zealand; The detection of Parkinson’s disease (PD) in its early stages is of great importance for its treatment and management, but consensus is lacking on what information is necessary and what models should be used to best predict PD risk. In our study, we first grouped PD-associated factors based on their cost and accessibility, and then gradually incorporated them into risk predictions, which were built using eight commonly used machine learning models to allow for comprehensive assessment. Finally, the Shapley Additive Explanations (SHAP) method was used to investigate the contributions of each factor. We found that models built with demographic variables, hospital admission examinations, clinical assessment, and polygenic risk score achieved the best prediction performance, and the inclusion of invasive biomarkers could not further enhance its accuracy. Among the eight machine learning models considered, penalized logistic regression and XGBoost were the most accurate algorithms for assessing PD risk, with penalized logistic regression achieving an area under the curve of 0.94 and a Brier score of 0.08. Olfactory function and polygenic risk scores were the most important predictors for PD risk. Our research has offered a practical framework for PD risk assessment, where necessary information and efficient machine learning tools were highlighted.; https://www.mdpi.com/2218-273X/13/12/1761; machine learning; Parkinson’s disease; polygenic risk scores; risk prediction model; SHAP value |
spellingShingle | Jiayu Zhang Wenchao Zhou Hongmei Yu Tong Wang Xiaqiong Wang Long Liu Yalu Wen Prediction of Parkinson’s Disease Using Machine Learning Methods Biomolecules machine learning Parkinson’s disease polygenic risk scores risk prediction model SHAP value |
title | Prediction of Parkinson’s Disease Using Machine Learning Methods |
title_sort | prediction of parkinson s disease using machine learning methods |
topic | machine learning Parkinson’s disease polygenic risk scores risk prediction model SHAP value |
url | https://www.mdpi.com/2218-273X/13/12/1761 |
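
The description field above reports two evaluation metrics for the risk models, an area under the curve of 0.94 and a Brier score of 0.08. As a reading aid, the sketch below shows how these two metrics are conventionally defined. It is not the authors' code: the function names and the toy labels and risks are hypothetical, invented only for illustration, and the record does not say which software the study used.

```python
from typing import List


def brier_score(y_true: List[int], y_prob: List[float]) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: List[int], y_prob: List[float]) -> float:
    """Share of (case, control) pairs where the case gets the higher risk.

    Tied pairs count as half a win (pairwise definition of the AUC).
    """
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Toy data, invented for illustration only (1 = PD case, 0 = control).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_risks = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

toy_brier = brier_score(toy_labels, toy_risks)
toy_auc = roc_auc(toy_labels, toy_risks)
print(f"Brier score: {toy_brier:.3f}, AUC: {toy_auc:.3f}")
```

A lower Brier score indicates better-calibrated and more accurate probabilities (0 is perfect), while an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.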