Tackling racial bias in automated online hate detection: Towards fair and accurate detection of hateful users with geometric deep learning


Bibliographic Details
Main Authors: Ahmed, Z, Vidgen, B, Hale, SA
Format: Journal article
Language: English
Published: Springer Nature 2022
description Online hate is a growing concern on many social media platforms, making them unwelcoming and unsafe. To combat this, technology companies are increasingly developing techniques to automatically identify and sanction hateful users. However, accurate detection of such users remains a challenge due to the contextual nature of speech, whose meaning depends on the social setting in which it is used. This contextual nature of speech has also led to minoritized users, especially African-Americans, being unfairly detected as ‘hateful’ by the very algorithms designed to protect them. To resolve this problem of inaccurate and unfair hate detection, research has focused on developing machine learning (ML) systems that better understand textual context. Incorporating the social networks of hateful users has received far less attention, despite social science research suggesting that they provide rich contextual information. We present a system for more accurately and fairly detecting hateful users by incorporating social network information through geometric deep learning, an ML technique that dynamically learns information-rich network representations. We make two main contributions. First, we demonstrate that adding network information with geometric deep learning produces a more accurate classifier than techniques that either exclude network information entirely or incorporate it through manual feature engineering; our best-performing model achieves an AUC score of 90.8% on a previously released hateful user dataset. Second, we show that such information also leads to fairer outcomes: using the ‘predictive equality’ fairness criterion, we compare the false positive rates of our geometric deep learning algorithm with those of other ML techniques and find that our best-performing classifier has no false positives among a subset of African-American users. A neural network without network information has the most false positives at 26, while a neural network incorporating manual network features has 13 false positives among African-American users. The system we present highlights the importance of effectively incorporating social network features in automated hateful user detection, raising new opportunities to improve how online hate is tackled.
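The description characterizes geometric deep learning as a technique that learns representations from network structure. The core operation of one common variant, a graph convolution layer, can be sketched as H' = ReLU(Â H W), where Â is the degree-normalized adjacency matrix with self-loops. The following is a minimal NumPy illustration of that generic layer on a toy follower graph with random weights; it is not the authors' architecture or code:

```python
import numpy as np

# Toy follower graph among 4 users (symmetric, for simplicity).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)

def gcn_layer(A, H, W):
    """One graph-convolution layer: ReLU(A_hat @ H @ W),
    where A_hat is the degree-normalized adjacency with self-loops."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(1))   # D^{-1/2}
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)     # ReLU

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))   # per-user input features (hypothetical)
W = rng.normal(size=(3, 2))   # weights (learned in practice; random here)
H_out = gcn_layer(A, H, W)    # each output row mixes a user's own
                              # features with those of its neighbours
```

Stacking such layers lets a user's representation incorporate information from progressively larger network neighbourhoods, which is what allows network context to inform the classification.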
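The ‘predictive equality’ criterion referenced in the description asks whether false positive rates are equal across demographic groups. A minimal sketch of that check (the group labels and predictions below are hypothetical, not the paper's data):

```python
# Predictive equality: compare false positive rates (FPR) across groups.
# All data below is hypothetical, for illustration only.

def false_positive_rate(y_true, y_pred):
    """FPR = false positives / all actual negatives."""
    preds_on_negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not preds_on_negatives:
        return 0.0
    return sum(preds_on_negatives) / len(preds_on_negatives)

def predictive_equality_gap(y_true, y_pred, groups):
    """Per-group FPRs and the largest FPR difference between groups;
    a gap of 0.0 means predictive equality holds exactly."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates[g] = false_positive_rate([y_true[i] for i in idx],
                                       [y_pred[i] for i in idx])
    return max(rates.values()) - min(rates.values()), rates

# Hypothetical labels: 1 = flagged as hateful, 0 = not hateful.
y_true = [0, 0, 1, 0, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap, rates = predictive_equality_gap(y_true, y_pred, groups)
```

In these hypothetical data, group "a" suffers one false positive among three true negatives while group "b" has none, so the gap is nonzero and predictive equality is violated; the paper's claim of zero false positives for a subset of African-American users corresponds to driving one group's FPR to zero.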
id oxford-uuid:d5686e74-f66c-462e-9d2d-850fca8cb205
institution University of Oxford