Machine Learning Smart System for Parkinson Disease Classification Using the Voice as a Biomarker

Objectives: This study presents PD Predict, a machine learning system for Parkinson disease classification using voice as a biomarker. Methods: We first created an original set of recordings from the mPower study, and then extracted several audio features, such as mel-frequency cepstral coefficient (M...

Bibliographic Details
Main Authors:	Ilias Tougui, Abdelilah Jilbab, Jamal El Mhamdi
Format:	Article
Language:	English
Published:	The Korean Society of Medical Informatics 2022-07-01
Series:	Healthcare Informatics Research
Subjects:	Parkinson disease; voice disorders; machine learning; diagnosis, computer-assisted; medical informatics applications
Online Access:	http://www.e-hir.org/upload/pdf/hir-2022-28-3-210.pdf

author	Ilias Tougui; Abdelilah Jilbab; Jamal El Mhamdi
author_sort	Ilias Tougui
collection	DOAJ
description	Objectives: This study presents PD Predict, a machine learning system for Parkinson disease classification using voice as a biomarker. Methods: We first created an original set of recordings from the mPower study and then extracted several audio features, such as mel-frequency cepstral coefficient (MFCC) components and other classical speech features, using a windowing procedure. The generated dataset was then divided into training and holdout sets. The training set was used to train two machine learning pipelines, and their performance was estimated using a nested subject-wise cross-validation approach. The holdout set was used to assess how well the pipelines generalize to unseen data. The final pipelines were implemented in PD Predict and accessed through a prediction endpoint developed using the Django REST Framework. PD Predict is a two-component system: a desktop application that captures audio recordings, extracts audio features, and makes predictions; and a server-side web application that implements the machine learning pipelines and processes incoming requests containing the extracted audio features to make predictions. The system is deployed and accessible at https://pdpredict.herokuapp.com/. Results: Both machine learning pipelines showed moderate performance, between 65% and 75%, under the nested subject-wise cross-validation approach. Furthermore, they generalized well to unseen data and did not overfit the training set. Conclusions: The architecture of PD Predict is clear, and the performance of the implemented machine learning pipelines is promising and confirms the usability of smartphone microphones for capturing digital biomarkers of disease.
first_indexed	2024-04-11T13:51:40Z
format	Article
id	doaj.art-906f684ba15549d7a99a9b50e4ef3061
institution	Directory of Open Access Journals
issn	2093-3681 2093-369X
language	English
last_indexed	2024-04-11T13:51:40Z
publishDate	2022-07-01
publisher	The Korean Society of Medical Informatics
record_format	Article
series	Healthcare Informatics Research
spelling	doaj.art-906f684ba15549d7a99a9b50e4ef3061; 2022-12-22T04:20:33Z; eng; The Korean Society of Medical Informatics; Healthcare Informatics Research; 2093-3681; 2093-369X; 2022-07-01; 28; 3; 210; 221; 10.4258/hir.2022.28.3.210; 1122; Machine Learning Smart System for Parkinson Disease Classification Using the Voice as a Biomarker; Ilias Tougui (0); Abdelilah Jilbab (1); Jamal El Mhamdi (2); (0) E2SN, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco; (1) E2SN, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco; (2) E2SN, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco; http://www.e-hir.org/upload/pdf/hir-2022-28-3-210.pdf; parkinson disease; voice disorders; machine learning; diagnosis; computer-assisted; medical informatics applications
title	Machine Learning Smart System for Parkinson Disease Classification Using the Voice as a Biomarker
topic	Parkinson disease; voice disorders; machine learning; diagnosis, computer-assisted; medical informatics applications
url	http://www.e-hir.org/upload/pdf/hir-2022-28-3-210.pdf

Similar Items