Reliability of crowdsourcing as a method for collecting emotions labels on pictures

Abstract
Objective: In this paper we study whether, and under what conditions, crowdsourcing can be used as a reliable method for collecting high-quality emotion labels on pictures. To this end, we ran a set of crowdsourcing experiments on the widely used IAPS dataset, using the Self-Assessment Manikin (SAM) emotion collection instrument to rate pictures on valence, arousal and dominance, and explored the consistency of crowdsourced results across multiple runs (reliability) and their level of agreement with the gold labels (quality). In doing so, we examined the impact of targeting populations of different levels of reputation (and cost) and of collecting varying numbers of ratings per picture.
Results: The results show that crowdsourcing can be a reliable method, reaching excellent levels of reliability and agreement with only 3 ratings per picture for valence and 8 for arousal, with only marginal differences between target populations. Results for dominance were very poor, echoing previous studies on the data collection instrument used. We also observed that specific types of content generate diverging opinions among participants (leading to higher variability or multimodal distributions), and that these divergences remain consistent across pictures of the same theme. These observations can inform the collection and exploitation of crowdsourced emotion datasets.
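The abstract does not report the exact computation behind the "agreement with gold labels for k ratings per picture" result, but a minimal sketch of that kind of analysis might look like the Python below. The data structures (crowd_valence, gold_valence), the Pearson correlation metric and the resampling scheme are illustrative assumptions, not the authors' actual pipeline.

# Illustrative sketch (not the authors' code): estimate how agreement with
# gold labels changes with the number of crowd ratings per picture.
# Assumes `crowd` maps picture id -> list of SAM ratings (1-9) from workers,
# and `gold` maps picture id -> gold-standard rating (e.g., IAPS norms).
import random
from statistics import mean

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0

def agreement_at_k(crowd, gold, k, trials=100, seed=0):
    """Mean correlation between gold labels and the average of k
    randomly sampled crowd ratings per picture."""
    rng = random.Random(seed)
    corrs = []
    for _ in range(trials):
        pics = [p for p in gold if len(crowd.get(p, [])) >= k]
        est = [mean(rng.sample(crowd[p], k)) for p in pics]
        ref = [gold[p] for p in pics]
        corrs.append(pearson(est, ref))
    return mean(corrs)

# Example: agreement for valence with 3 ratings per picture.
# print(agreement_at_k(crowd_valence, gold_valence, k=3))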

Bibliographic Details
Main Authors: Olga Korovina, Marcos Baez, Fabio Casati (all University of Trento)
Format: Article
Language: English
Published: BMC, 2019-10-01
Series: BMC Research Notes
ISSN: 1756-0500
DOI: 10.1186/s13104-019-4764-4
Collection: DOAJ (Directory of Open Access Journals)
Subjects: Crowdsourcing emotions; Empirical study; Rating behavior; Reliability
Online Access: http://link.springer.com/article/10.1186/s13104-019-4764-4