Assessing the accuracy of automatic speech recognition for psychotherapy

Abstract Accurate transcription of audio recordings in psychotherapy would improve therapy effectiveness, clinician training, and safety monitoring. Although automatic speech recognition software is commercially available, its accuracy in mental health settings has not been well described. It is unc...

Bibliographic Details
Main Authors: Adam S. Miner, Albert Haque, Jason A. Fries, Scott L. Fleming, Denise E. Wilfley, G. Terence Wilson, Arnold Milstein, Dan Jurafsky, Bruce A. Arnow, W. Stewart Agras, Li Fei-Fei, Nigam H. Shah
Format: Article
Language: English
Published: Nature Portfolio 2020-06-01
Series: npj Digital Medicine
Online Access: https://doi.org/10.1038/s41746-020-0285-8
description Abstract Accurate transcription of audio recordings in psychotherapy would improve therapy effectiveness, clinician training, and safety monitoring. Although automatic speech recognition software is commercially available, its accuracy in mental health settings has not been well described. It is unclear which metrics and thresholds are appropriate for different clinical use cases, which may range from population descriptions to individual safety monitoring. Here we show that automatic speech recognition is feasible in psychotherapy, but further improvements in accuracy are needed before widespread use. Our HIPAA-compliant automatic speech recognition system demonstrated a transcription word error rate of 25%. For depression-related utterances, sensitivity was 80% and positive predictive value was 83%. For clinician-identified harm-related sentences, the word error rate was 34%. These results suggest that automatic speech recognition may support understanding of language patterns and subgroup variation in existing treatments but may not be ready for individual-level safety surveillance.
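The abstract reports three metrics: word error rate, sensitivity, and positive predictive value. The record does not say how the authors computed them, so the following is only a minimal sketch of the standard textbook definitions. It assumes whitespace tokenization with no text normalization; the function names and the counts in the usage lines are hypothetical illustrations, not data from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are split on whitespace and compared exactly
    (no lowercasing or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of truly relevant utterances that were detected."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Share of detected utterances that were truly relevant."""
    return true_positives / (true_positives + false_positives)


# Hypothetical usage: one substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))  # 0.25
# Hypothetical counts, chosen only to illustrate the formulas.
print(sensitivity(true_positives=8, false_negatives=2))
print(positive_predictive_value(true_positives=8, false_positives=2))
```

Because insertions count as errors, a word error rate computed this way can exceed 1.0 when the hypothesis is much longer than the reference.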
id doaj.art-18d9cddcaf5d45d09a077f644a1611a5
institution Directory Open Access Journal
issn 2398-6352
Author affiliations:
Adam S. Miner: Department of Psychiatry and Behavioral Sciences, Stanford University
Albert Haque: Department of Computer Science, Stanford University
Jason A. Fries: Center for Biomedical Informatics Research, Stanford University
Scott L. Fleming: Department of Biomedical Data Science, Stanford University
Denise E. Wilfley: Departments of Psychiatry, Medicine, Pediatrics, and Psychological & Brain Sciences, Washington University in St. Louis
G. Terence Wilson: Graduate School of Applied and Professional Psychology, Rutgers, the State University of New Jersey
Arnold Milstein: Clinical Excellence Research Center, Stanford University
Dan Jurafsky: Department of Computer Science, Stanford University
Bruce A. Arnow: Department of Psychiatry and Behavioral Sciences, Stanford University
W. Stewart Agras: Department of Psychiatry and Behavioral Sciences, Stanford University
Li Fei-Fei: Department of Computer Science, Stanford University
Nigam H. Shah: Center for Biomedical Informatics Research, Stanford University