Speech-Based Artificial Intelligence Emotion Biomarkers in Frontotemporal Dementia

Acoustic speech markers are well-characterized in Frontotemporal Dementia (FTD), a heterogeneous spectrum of progressive neurodegenerative diseases that can affect speech production and comprehension as well as higher-order cognition, behavior, and motor control. Although profound apathy and deficits in emotion processing are also common symptoms, emotional content has yet to be explored in acoustic models of speech. We retrospectively analyze a dataset of standard elicited speech tasks from 69 FTD participants and 131 healthy elderly controls seen at the University of Melbourne. We develop two ResNet50 models that classify FTD vs. healthy elderly controls from spectrograms of speech samples: 1) a naive model, and 2) a model pretrained on an emotional speech dataset. We compare the validation accuracies of the two models across speech tasks. The pretrained model better classifies FTD vs. healthy elderly controls and the behavioral variant of FTD (bvFTD) vs. healthy elderly controls, with validation accuracies of 79% and 84%, respectively, on the monologue speech task, and 93% and 90% on the picture description task. Considered individually, the 'happy' emotion discriminates FTD from healthy elderly controls better than the other latent emotions. Pretraining acoustic models on latent emotion increases classification accuracy for FTD, and we demonstrate the greatest improvement in model performance on elicited speech tasks with greater emotional content. More broadly, our findings suggest that including latent emotion in acoustic classification models is beneficial in neurologic diseases that affect emotion.
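
The abstract describes a two-stage transfer-learning recipe: a ResNet50 is first trained on spectrograms from an emotional speech corpus, then its weights are reused to classify FTD vs. healthy elderly controls from spectrograms of the elicited speech tasks. The sketch below is a minimal, hypothetical illustration of that recipe; the framework (PyTorch/torchvision), folder layout, image size, and hyperparameters are assumptions for illustration, not details taken from the thesis.

```python
# Hypothetical sketch of the two-stage setup described in the abstract:
# pretrain a ResNet50 on an emotional speech dataset (spectrogram images),
# then reuse its weights to classify FTD vs. healthy elderly controls.
# Framework choice (PyTorch/torchvision) and all paths/hyperparameters are
# assumptions; the thesis does not specify its implementation details.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Spectrograms saved as images, resized to the usual ResNet input resolution.
spectrogram_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def make_loader(root, batch_size=32, shuffle=True):
    """Assumes an ImageFolder layout: root/<class_name>/<spectrogram>.png"""
    ds = datasets.ImageFolder(root, transform=spectrogram_tf)
    return DataLoader(ds, batch_size=batch_size, shuffle=shuffle), len(ds.classes)

def train(model, loader, epochs, lr, device):
    """Plain supervised training loop with cross-entropy loss."""
    model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stage 1: pretrain on an emotional speech dataset (the specific corpus used
# in the thesis is not assumed here).
emo_loader, n_emotions = make_loader("emotion_spectrograms/train")
model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, n_emotions)
model = train(model, emo_loader, epochs=10, lr=1e-4, device=device)

# Stage 2: replace the head and fine-tune on FTD vs. healthy elderly controls.
ftd_loader, n_groups = make_loader("ftd_spectrograms/train")  # 2 classes
model.fc = nn.Linear(model.fc.in_features, n_groups)
model = train(model, ftd_loader, epochs=10, lr=1e-5, device=device)
```

The "naive" baseline from the abstract would correspond to skipping Stage 1 and training a fresh ResNet50 directly on the FTD/control spectrograms; comparing validation accuracy of the two variants on each elicited speech task mirrors the comparison reported above.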

Bibliographic Details
Main Author: Parllaku, Fjona
Other Authors: Glass, James
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: M.Eng.
Format: Thesis
Thesis Date: 2022-09
Published: Massachusetts Institute of Technology, 2023
Rights: In Copyright - Educational Use Permitted; Copyright MIT (http://rightsstatements.org/page/InC-EDU/1.0/)
Online Access: https://hdl.handle.net/1721.1/147525