Speech-Based Artificial Intelligence Emotion Biomarkers in Frontotemporal Dementia

Bibliographic Details
Main Author: Parllaku, Fjona
Other Authors: Glass, James
Format: Thesis
Published: Massachusetts Institute of Technology 2023
Online Access: https://hdl.handle.net/1721.1/147525
Description
Summary: Acoustic speech markers are well-characterized in Frontotemporal Dementia (FTD), a heterogeneous spectrum of progressive neurodegenerative diseases that can affect speech production and comprehension as well as higher-order cognition, behavior, and motor control. While profound apathy and deficits in emotion processing are also common symptoms, emotional content has yet to be explored in acoustic models of speech. We retrospectively analyze a dataset of standard elicited speech tasks from 69 participants with FTD and 131 healthy elderly controls seen at the University of Melbourne. We develop two ResNet50 models that classify FTD vs. healthy elderly controls from spectrograms of speech samples: 1) a naive model, and 2) a model pretrained on an emotional speech dataset. We compare the validation accuracies of the two models across speech tasks. The pretrained model better classifies FTD vs. healthy elderly controls and the behavioral variant of FTD (bvFTD) vs. healthy elderly controls, with validation accuracies of 79% and 84%, respectively, on the monologue task, and 93% and 90% on the picture description task. Considered individually, the ‘happy’ emotion discriminates best between FTD and healthy elderly controls compared with the other latent emotions. Pretraining acoustic models on latent emotion increases classification accuracy for FTD, with the greatest improvement on elicited speech tasks with greater emotional content. More broadly, our findings suggest that including latent emotion in acoustic classification models benefits the study of neurologic diseases that affect emotion.
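The models described above take spectrograms of speech samples as input. The sketch below illustrates, under stated assumptions, how a waveform can be converted to a magnitude spectrogram via a short-time FFT; the window and hop sizes here are illustrative defaults, not the parameters used in the thesis.

```python
# Hypothetical sketch of the spectrogram front end assumed to feed the
# ResNet50 classifiers. Window/hop sizes are illustrative assumptions.
import numpy as np

def spectrogram(signal, n_fft=512, hop=256):
    """Magnitude spectrogram via a Hann-windowed short-time FFT."""
    window = np.hanning(n_fft)
    # Slice the signal into overlapping, windowed frames
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    # Keep only non-negative frequencies; shape: (n_frames, n_fft // 2 + 1)
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))

# Example: 1 second of a 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (61, 257)
```

In practice such spectrograms are typically log-scaled and resized to the fixed input resolution a ResNet50 expects before training.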