Assessing the Quality of Mobile Health-Related Apps: Interrater Reliability Study of Two Guides

Background: There is a huge number of health-related apps available, and the numbers are growing fast. However, many of them have been developed without any kind of quality control. In an attempt to contribute to the development of high-quality apps and enable existing apps to be assessed, several gui...

Bibliographic Details
Main Authors:	Miró, Jordi; Llorens-Vernet, Pere
Format:	Article
Language:	English
Published:	JMIR Publications 2021-04-01
Series:	JMIR mHealth and uHealth
Online Access:	https://mhealth.jmir.org/2021/4/e26471

_version_	1818886253292027904
author	Miró, Jordi Llorens-Vernet, Pere
author_facet	Miró, Jordi Llorens-Vernet, Pere
author_sort	Miró, Jordi
collection	DOAJ
description	Background: There is a huge number of health-related apps available, and the numbers are growing fast. However, many of them have been developed without any kind of quality control. In an attempt to contribute to the development of high-quality apps and enable existing apps to be assessed, several guides have been developed. Objective: The main aim of this study was to examine the interrater reliability of a new guide, the Mobile App Development and Assessment Guide (MAG), and compare it with one of the most used guides in the field, the Mobile App Rating Scale (MARS). Moreover, we also focused on whether the interrater reliability of the measures is consistent across multiple types of apps and stakeholders. Methods: In order to study the interrater reliability of the MAG and MARS, we evaluated the 4 most downloaded health apps for chronic health conditions in the medical category of iOS and Android devices (ie, App Store and Google Play). A group of 8 reviewers, representative of individuals that would be most knowledgeable and interested in the use and development of health-related apps and including different types of stakeholders such as clinical researchers, engineers, health care professionals, and end users as potential patients, independently evaluated the quality of the apps using the MAG and MARS. We calculated the Krippendorff alpha for every category in the 2 guides, for each type of reviewer and every app, separately and combined, to study the interrater reliability. Results: Only a few categories of the MAG and MARS demonstrated a high interrater reliability. Although the MAG was found to be superior, there was considerable variation in the scores between the different types of reviewers. The categories with the highest interrater reliability in MAG were “Security” (α=0.78) and “Privacy” (α=0.73). In addition, 2 other categories, “Usability” and “Safety,” were very close to compliance (health care professionals: α=0.62 and 0.61, respectively). The total interrater reliability of the MAG (ie, for all categories) was 0.45, whereas the total interrater reliability of the MARS was 0.29. Conclusions: This study shows that some categories of MAG have significant interrater reliability. Importantly, the data show that the MAG scores are better than the ones provided by the MARS, which is the most commonly used guide in the area. However, there is great variability in the responses, which seems to be associated with subjective interpretation by the reviewers.
first_indexed	2024-12-19T16:18:24Z
format	Article
id	doaj.art-33dad88a60c24b47b847672d2940a577
institution	Directory of Open Access Journals
issn	2291-5222
language	English
last_indexed	2024-12-19T16:18:24Z
publishDate	2021-04-01
publisher	JMIR Publications
record_format	Article
series	JMIR mHealth and uHealth
spelling	doaj.art-33dad88a60c24b47b847672d2940a577 2022-12-21T20:14:34Z eng JMIR Publications JMIR mHealth and uHealth 2291-5222 2021-04-01 9 4 e26471 10.2196/26471 Assessing the Quality of Mobile Health-Related Apps: Interrater Reliability Study of Two Guides Miró, Jordi Llorens-Vernet, Pere https://mhealth.jmir.org/2021/4/e26471
spellingShingle	Miró, Jordi Llorens-Vernet, Pere Assessing the Quality of Mobile Health-Related Apps: Interrater Reliability Study of Two Guides JMIR mHealth and uHealth
title	Assessing the Quality of Mobile Health-Related Apps: Interrater Reliability Study of Two Guides
title_full	Assessing the Quality of Mobile Health-Related Apps: Interrater Reliability Study of Two Guides
title_fullStr	Assessing the Quality of Mobile Health-Related Apps: Interrater Reliability Study of Two Guides
title_full_unstemmed	Assessing the Quality of Mobile Health-Related Apps: Interrater Reliability Study of Two Guides
title_short	Assessing the Quality of Mobile Health-Related Apps: Interrater Reliability Study of Two Guides
title_sort	assessing the quality of mobile health related apps interrater reliability study of two guides
url	https://mhealth.jmir.org/2021/4/e26471
work_keys_str_mv	AT mirojordi assessingthequalityofmobilehealthrelatedappsinterraterreliabilitystudyoftwoguides AT llorensvernetpere assessingthequalityofmobilehealthrelatedappsinterraterreliabilitystudyoftwoguides
