Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning

This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object....


Bibliographic Details
Main Authors:	Romena Yasmin, Md Mahmudulla Hassan, Joshua T. Grassel, Harika Bhogaraju, Adolfo R. Escobedo, Olac Fuentes
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2022-06-01
Series:	Frontiers in Artificial Intelligence
Subjects:	machine learning; input elicitations; crowdsourcing; human computation; image classification
Online Access:	https://www.frontiersin.org/articles/10.3389/frai.2022.848056/full

_version_	1818218537578135552
author	Romena Yasmin, Md Mahmudulla Hassan, Joshua T. Grassel, Harika Bhogaraju, Adolfo R. Escobedo, Olac Fuentes
author_sort	Romena Yasmin
collection	DOAJ
description	This work investigates how different forms of input elicitation obtained from crowdsourcing can be utilized to improve the quality of inferred labels for image classification tasks, where an image must be labeled as either positive or negative depending on the presence/absence of a specified object. Five types of input elicitation methods are tested: binary classification (positive or negative); the (x, y)-coordinates of the position where participants believe a target object is located; level of confidence in the binary response (on a scale from 0 to 100%); participants' prediction of the binary classification given by the majority of the other participants; and participants' perceived difficulty level of the task (on a discrete scale). We design two crowdsourcing studies to test the performance of a variety of input elicitation methods and utilize data from over 300 participants. Various existing voting and machine learning (ML) methods are applied to make the best use of these inputs. To assess their performance on classification tasks of varying difficulty, a systematic synthetic image generation process is developed. Each generated image combines items from the MPEG-7 Core Experiment CE-Shape-1 Test Set into a single image using multiple parameters (e.g., density and transparency) and may or may not contain a target object. The difficulty of these images is validated by the performance of an automated image classification method. Experimental results suggest that higher accuracy can be achieved with smaller training datasets when both the crowdsourced binary classification labels and the average of the self-reported confidence values in these labels are used as features for the ML classifiers. Moreover, when a relatively larger, properly annotated dataset is available, in some cases augmenting these ML algorithms with the results (i.e., probability of outcome) from an automated classifier can achieve even higher performance than what can be obtained by using any one of the individual classifiers. Lastly, supplementary analysis of the collected data demonstrates that other performance metrics of interest, namely reduced false-negative rates, can be prioritized through special modifications of the proposed aggregation methods.
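The description mentions two ideas that a small example may make more concrete: using the crowd's binary labels together with the average self-reported confidence as classifier features, and adjusting the vote aggregation to favor fewer false negatives. The following is only an illustrative sketch and not the authors' implementation. The function names, the choice of "share of positive votes" as the label feature, and the `threshold` parameter are all assumptions introduced here.

```python
from statistics import mean


def crowd_features(votes, confidences):
    """Summarize one image's crowd responses as two features.

    votes: list of 0/1 binary labels, one per participant (hypothetical layout).
    confidences: self-reported confidence values on a 0-100 scale.
    Returns (share of positive votes, mean confidence).
    """
    if not votes or len(votes) != len(confidences):
        raise ValueError("votes and confidences must be non-empty and aligned")
    return sum(votes) / len(votes), mean(confidences)


def aggregate_label(votes, threshold=0.5):
    """Vote-based label: positive when the share of positive votes
    reaches `threshold`. With threshold=0.5 this is majority voting with
    ties counted as positive (an assumption). Lowering the threshold
    trades more false positives for fewer false negatives.
    """
    if not votes:
        raise ValueError("votes must be non-empty")
    return 1 if sum(votes) / len(votes) >= threshold else 0


# Made-up responses from five participants for a single image.
votes = [1, 0, 0, 1, 0]
confidences = [90, 60, 55, 80, 70]

features = crowd_features(votes, confidences)
majority = aggregate_label(votes)
cautious = aggregate_label(votes, threshold=0.3)
```

In this made-up example the plain majority vote labels the image negative, while the lowered threshold labels it positive. The feature pair could then be passed to any standard classifier; which classifiers and settings the article actually uses is not stated in this record.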
first_indexed	2024-12-12T07:25:20Z
format	Article
id	doaj.art-eb44110572844cebbf50d701f03d2aba
institution	Directory Open Access Journal
issn	2624-8212
language	English
last_indexed	2024-12-12T07:25:20Z
publishDate	2022-06-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Artificial Intelligence
spelling	doaj.art-eb44110572844cebbf50d701f03d2aba; 2022-12-22T00:33:10Z; eng; Frontiers Media S.A.; Frontiers in Artificial Intelligence; 2624-8212; 2022-06-01; 5; 10.3389/frai.2022.848056; 848056; Romena Yasmin (School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, United States); Md Mahmudulla Hassan (Department of Computer Science, University of Texas at El Paso, El Paso, TX, United States); Joshua T. Grassel (School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, United States); Harika Bhogaraju (School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, United States); Adolfo R. Escobedo (School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, United States); Olac Fuentes (Department of Computer Science, University of Texas at El Paso, El Paso, TX, United States)
title	Improving Crowdsourcing-Based Image Classification Through Expanded Input Elicitation and Machine Learning


Similar Items