HateCheck: functional tests for hate speech detection models
Detecting online hate is a difficult task that even state-of-the-art models struggle with. Typically, hate speech detection models are evaluated by measuring their performance on held-out test data using metrics such as accuracy and F1 score. However, this approach makes it difficult to identify spe...
Main Authors: | , , , , , |
---|---|
Format: | Conference item |
Language: | English |
Published: | Association for Computational Linguistics, 2021 |