Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
Detecting online hate is a complex task, and low-performing detection models have harmful consequences when used for sensitive applications such as content moderation. Emoji-based hate is a key emerging challenge for online hate detection. We present HatemojiCheck, a test suite of 3,930 short-form s...
Main Authors: | Kirk, H, Vidgen, B, Röttger, P, Hale, SA |
---|---|
Format: | Working paper |
Language: | English |
Published: | 2021 |
Similar Items
- Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
  by: Kirk, HR, et al.
  Published: (2022)
- HateCheck: functional tests for hate speech detection models
  by: Röttger, P, et al.
  Published: (2021)
- Deciphering implicit hate: evaluating automated detection algorithms for multimodal hate
  by: Botelho, A, et al.
  Published: (2021)
- Tackling racial bias in automated online hate detection: Towards fair and accurate detection of hateful users with geometric deep learning
  by: Ahmed, Z, et al.
  Published: (2022)
- The benefits, risks and bounds of personalizing the alignment of large language models to individuals
  by: Kirk, HR, et al.
  Published: (2024)