Reliability of Commercial Voice Assistants’ Responses to Health-Related Questions in Noncommunicable Disease Management: Factorial Experiment Assessing Response Rate and Source of Information


Bibliographic Details
Main Authors: Caterina Bérubé, Zsolt Ferenc Kovacs, Elgar Fleisch, Tobias Kowatsch
Format: Article
Language: English
Published: JMIR Publications 2021-12-01
Series: Journal of Medical Internet Research
Online Access: https://www.jmir.org/2021/12/e32161
author Caterina Bérubé
Zsolt Ferenc Kovacs
Elgar Fleisch
Tobias Kowatsch
collection DOAJ
description Background: Noncommunicable diseases (NCDs) constitute a burden on public health. These are best controlled through self-management practices, such as self-information. Fostering patients’ access to health-related information through efficient and accessible channels, such as commercial voice assistants (VAs), may support the patients’ ability to make health-related decisions and manage their chronic conditions.
Objective: This study aims to evaluate the reliability of the most common VAs (ie, Amazon Alexa, Apple Siri, and Google Assistant) in responding to questions about the management of the main NCDs.
Methods: We generated health-related questions based on frequently asked questions from health organization, government, medical nonprofit, and other recognized health-related websites about conditions associated with Alzheimer’s disease (AD), lung cancer (LCA), chronic obstructive pulmonary disease, diabetes mellitus (DM), cardiovascular disease, chronic kidney disease (CKD), and cerebrovascular accident (CVA). We then validated them with practicing medical specialists, selecting the 10 most frequent ones for each disease. Given the low average frequency of the AD-related questions, we excluded such questions. This resulted in a pool of 60 questions. We submitted the selected questions to VAs in a 3×3×6 fractional factorial design experiment with 3 developers (ie, Amazon, Apple, and Google), 3 modalities (ie, voice only, voice and display, display only), and 6 diseases. We assessed the rate of error-free voice responses and classified the web sources based on previous research (ie, expert, commercial, crowdsourced, or not stated).
Results: Google showed the highest total response rate, followed by Amazon and Apple. Moreover, although Amazon and Apple showed a comparable response rate in both voice-and-display and voice-only modalities, Google showed a slightly higher response rate in voice only. The same pattern was observed for the rate of expert sources. When considering the response and expert source rates across diseases, we observed that although Google’s rates remained comparable across diseases, with a slight advantage for LCA and CKD, both Amazon and Apple showed the highest response rate for LCA. However, both Google and Apple most often showed expert sources for CVA, while Amazon did so for DM.
Conclusions: Google showed the highest response rate and the highest rate of expert sources, leading to the conclusion that Google Assistant would be the most reliable tool in responding to questions about NCD management. However, the rate of expert sources differed across diseases. We urge health organizations to collaborate with Google, Amazon, and Apple to allow their VAs to consistently provide reliable answers to health-related questions on NCD management across the different diseases.
first_indexed 2024-03-12T12:59:13Z
format Article
id doaj.art-26d96c7278884889a6f08b78d47d060c
institution Directory Open Access Journal
issn 1438-8871
language English
last_indexed 2024-03-12T12:59:13Z
publishDate 2021-12-01
publisher JMIR Publications
record_format Article
series Journal of Medical Internet Research
spelling doaj.art-26d96c7278884889a6f08b78d47d060c
2023-08-28T20:02:56Z
eng
JMIR Publications
Journal of Medical Internet Research
1438-8871
2021-12-01
vol 23, no 12, e32161
10.2196/32161
Caterina Bérubé https://orcid.org/0000-0001-5247-8485
Zsolt Ferenc Kovacs https://orcid.org/0000-0002-8718-2382
Elgar Fleisch https://orcid.org/0000-0002-4842-1117
Tobias Kowatsch https://orcid.org/0000-0001-5939-4145
https://www.jmir.org/2021/12/e32161
title Reliability of Commercial Voice Assistants’ Responses to Health-Related Questions in Noncommunicable Disease Management: Factorial Experiment Assessing Response Rate and Source of Information
url https://www.jmir.org/2021/12/e32161