HateCheck: functional tests for hate speech detection models
Detecting online hate is a difficult task that even state-of-the-art models struggle with. Typically, hate speech detection models are evaluated by measuring their performance on held-out test data using metrics such as accuracy and F1 score. However, this approach makes it difficult to identify spe...
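The held-out evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = hateful, 0 = non-hateful); the function name `accuracy_and_f1` and the toy label lists are hypothetical and are not taken from the paper.

```python
def accuracy_and_f1(gold, predicted, positive=1):
    """Compute accuracy and F1 for the positive class on held-out labels.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    pairs = list(zip(gold, predicted))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Accuracy: share of examples labelled correctly.
    accuracy = sum(1 for g, p in pairs if g == p) / len(pairs)

    # F1: harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN).
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Toy held-out set (assumed data): 3 hateful and 5 non-hateful examples.
gold = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 1]

acc, f1 = accuracy_and_f1(gold, predicted)
print(f"accuracy={acc:.3f}, f1={f1:.3f}")
```

In this toy case the accuracy is 0.625 while the F1 score is 0.4. Neither number shows which kinds of input the model gets wrong.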
Main Authors: | Röttger, P, Vidgen, B, Dong, N, Waseem, Z, Margetts, H, Pierrehumbert, JB |
---|---|
Format: | Conference item |
Language: | English |
Published: | Association for Computational Linguistics, 2021 |
Similar Items
- Improving the evaluation and effectiveness of hate speech detection models
  by: Röttger, P
  Published: (2023)
- Deciphering implicit hate: evaluating automated detection algorithms for multimodal hate
  by: Botelho, A, et al.
  Published: (2021)
- Detecting weak and strong Islamophobic hate speech on social media
  by: Vidgen, B, et al.
  Published: (2019)
- Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
  by: Kirk, H, et al.
  Published: (2021)