Summary: | A key problem in crowdsourcing is the aggregation of judgments of proportions. For example, workers might be presented with a news article or an image and asked to identify the proportion of each topic, sentiment, object, or colour present in it. These varying judgments then need to be aggregated to form a consensus view of the document’s contents. Often, however, the aggregate can be skewed by spammers, i.e. workers who provide judgments randomly. Spammers increase the cost of acquiring judgments and degrade the accuracy of the aggregation. For such cases, we provide a new Bayesian framework for aggregating these responses (expressed in the form of categorical distributions) that, for the first time, accounts for spammers. We elicit 796 judgments about proportions of objects and colours in images. Experimental results on three real-world datasets show that, when 60% of the workers are spammers, our framework achieves aggregation accuracy comparable to that of other state-of-the-art approaches when there are no spammers.
|