Like trainer, like bot? Inheritance of bias in algorithmic content moderation

The internet has become a central medium through which ‘networked publics’ express their opinions and engage in debate. Offensive comments and personal attacks can inhibit participation in these spaces. Automated content moderation aims to overcome this problem using machine learning classifiers trained on large corpora of texts manually annotated for offence. While such systems could help encourage more civil debate, they must navigate inherently normatively contestable boundaries, and are subject to the idiosyncratic norms of the human raters who provide the training data. An important objective for platforms implementing such measures might be to ensure that they are not unduly biased towards or against particular norms of offence. This paper provides some exploratory methods by which the normative biases of algorithmic content moderation systems can be measured, by way of a case study using an existing dataset of comments labelled for offence. We train classifiers on comments labelled by different demographic subsets (men and women) to understand how differences in conceptions of offence between these groups might affect the performance of the resulting models on various test sets. We conclude by discussing some of the ethical choices facing the implementers of algorithmic moderation systems, given various desired levels of diversity of viewpoints amongst discussion participants.
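The following is a minimal sketch of the cross-group evaluation the abstract describes: a separate offence classifier is trained on comments labelled by each annotator subgroup, and each model is then scored on held-out test sets labelled by either group. It is not the authors' actual pipeline; the file name, column names ('comment', 'label', 'annotator_gender') and the TF-IDF plus logistic-regression model are assumptions made for illustration.

```python
# Illustrative sketch only; the dataset columns and model choice are assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

comments = pd.read_csv("labelled_comments.csv")  # hypothetical file of annotated comments

models, test_sets = {}, {}
for group in ("men", "women"):
    # Keep only the comments labelled by this annotator subgroup.
    subset = comments[comments["annotator_gender"] == group]
    train, test = train_test_split(subset, test_size=0.2, random_state=0)
    # Simple text classifier: TF-IDF features fed into logistic regression.
    model = make_pipeline(TfidfVectorizer(min_df=2), LogisticRegression(max_iter=1000))
    model.fit(train["comment"], train["label"])
    models[group], test_sets[group] = model, test

# Evaluate every model on every group's held-out test set, to surface
# differences in how each model reproduces each group's notion of offence.
for trained_on, model in models.items():
    for tested_on, test in test_sets.items():
        score = f1_score(test["label"], model.predict(test["comment"]))
        print(f"trained on {trained_on}, tested on {tested_on}: F1 = {score:.3f}")
```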

Detailed specification

Bibliographic description
Main authors: Binns, R; Veale, M; Van Kleek, M; Shadbolt, N
Format: Conference item
Published: Springer, 2017
Institution: University of Oxford