HateCheck: functional tests for hate speech detection models

Detecting online hate is a difficult task that even state-of-the-art models struggle with. Typically, hate speech detection models are evaluated by measuring their performance on held-out test data using metrics such as accuracy and F1 score. However, this approach makes it difficult to identify spe...

Bibliographic details
Main authors: Röttger, P, Vidgen, B, Dong, N, Waseem, Z, Margetts, H, Pierrehumbert, JB
Format: Conference item
Language: English
Published: Association for Computational Linguistics 2021