Utility of Crowdsourced User Experiments for Measuring the Central Tendency of User Performance: A Case of Error-Rate Model Evaluation in a Pointing Task
The use of crowdsourcing to recruit numerous participants has been recognized as beneficial in the human-computer interaction (HCI) field, such as for designing user interfaces and validating user performance models. In this work, we investigate its effectiveness for evaluating an error-rate prediction model in target pointing tasks.
Main Author: | Shota Yamanaka |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-03-01 |
Series: | Frontiers in Artificial Intelligence |
Subjects: | crowdsourcing; graphical user interface; Fitts' law; user performance models; error-rate prediction |
Online Access: | https://www.frontiersin.org/articles/10.3389/frai.2022.798892/full |
author | Shota Yamanaka |
collection | DOAJ |
description | The use of crowdsourcing to recruit numerous participants has been recognized as beneficial in the human-computer interaction (HCI) field, such as for designing user interfaces and validating user performance models. In this work, we investigate its effectiveness for evaluating an error-rate prediction model in target pointing tasks. In contrast to models of operational times, a clicking error (i.e., missing a target) occurs by chance at a certain probability, e.g., 5%. Therefore, in traditional laboratory-based experiments, many repetitions are needed to measure the central tendency of error rates. We hypothesize that recruiting many workers would enable us to keep the number of repetitions per worker much smaller. We collected data from 384 workers and found that existing models of operational time and error rate showed good fits (both R² > 0.95). A simulation in which we varied the number of participants N_P and the number of repetitions N_repeat showed that the time prediction model was robust to small N_P and N_repeat, whereas the error-rate model fitness was considerably degraded. These findings empirically demonstrate a new utility of crowdsourced user experiments for recruiting numerous participants, which should be of great use to HCI researchers in their evaluation studies. |
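As a rough illustration of the resampling analysis the abstract describes (and not the paper's actual code), the following Python sketch fits Fitts' law, MT = a + b·ID, to synthetic pointing data and reports how R² changes as the number of participants N_P and repetitions N_repeat are reduced. All condition values, noise levels, and the data themselves are assumptions made for illustration; the paper's error-rate model would be resampled analogously.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed A x W pointing conditions in pixels (not the paper's actual design).
conditions = [(a, w) for a in (256, 512, 1024) for w in (16, 32, 64)]
ids = np.array([np.log2(a / w + 1) for a, w in conditions])  # Fitts' index of difficulty

N_P_FULL, N_REPEAT_FULL = 384, 20  # illustrative full pool of workers and repetitions

# Synthetic per-trial movement times: MT = a + b*ID plus participant and trial noise.
a_true, b_true = 0.3, 0.15                        # intercept [s] and slope [s/bit], assumed
participant_offset = rng.normal(0, 0.05, N_P_FULL)
trials = (a_true + b_true * ids[None, :, None]
          + participant_offset[:, None, None]
          + rng.normal(0, 0.08, (N_P_FULL, len(ids), N_REPEAT_FULL)))

def fitts_r2(mean_mt):
    """Least-squares fit of MT = a + b*ID over conditions; return the fit's R^2."""
    X = np.column_stack([np.ones_like(ids), ids])
    coef, *_ = np.linalg.lstsq(X, mean_mt, rcond=None)
    residuals = mean_mt - X @ coef
    return 1 - residuals @ residuals / np.sum((mean_mt - mean_mt.mean()) ** 2)

# Subsample participants (N_P) and repetitions (N_repeat), average per condition,
# and observe how model fitness changes as the sample shrinks.
for n_p in (8, 32, 128, 384):
    for n_rep in (2, 5, 20):
        p_idx = rng.choice(N_P_FULL, size=n_p, replace=False)
        mean_mt = trials[p_idx, :, :n_rep].mean(axis=(0, 2))
        print(f"N_P={n_p:3d}, N_repeat={n_rep:2d}: R^2 = {fitts_r2(mean_mt):.3f}")
```

With synthetic time data of this kind, R² stays high even for small samples, which mirrors the abstract's finding that the time model is robust; the per-trial, probabilistic nature of clicking errors is what makes the error-rate model more sensitive to small N_P and N_repeat.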
format | Article |
id | doaj.art-700573b1d24a46d58fb8d61c262d3bbf |
institution | Directory Open Access Journal |
issn | 2624-8212 |
language | English |
publishDate | 2022-03-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Artificial Intelligence |
title | Utility of Crowdsourced User Experiments for Measuring the Central Tendency of User Performance: A Case of Error-Rate Model Evaluation in a Pointing Task |
topic | crowdsourcing; graphical user interface; Fitts' law; user performance models; error-rate prediction |
url | https://www.frontiersin.org/articles/10.3389/frai.2022.798892/full |