HateCheck: functional tests for hate speech detection models

Detecting online hate is a difficult task that even state-of-the-art models struggle with. Typically, hate speech detection models are evaluated by measuring their performance on held-out test data using metrics such as accuracy and F1 score. However, this approach makes it difficult to identify spe...

Bibliographic Details
Main Authors:	Röttger, P, Vidgen, B, Dong, N, Waseem, Z, Margetts, H, Pierrehumbert, JB
Format:	Conference item
Language:	English
Published:	Association for Computational Linguistics 2021

Similar Items

Improving the evaluation and effectiveness of hate speech detection models
by: Röttger, P
Published: (2023)

Deciphering implicit hate: evaluating automated detection algorithms for multimodal hate
by: Botelho, A, et al.
Published: (2021)

Detecting weak and strong Islamophobic hate speech on social media
by: Vidgen, B, et al.
Published: (2019)

Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
by: Kirk, H, et al.
Published: (2021)

Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
by: Kirk, HR, et al.
Published: (2022)

Tackling racial bias in automated online hate detection: Towards fair and accurate detection of hateful users with geometric deep learning
by: Ahmed, Z, et al.
Published: (2022)

Putting the lid on hate speech
by: Murad, Dina
Published: (2019)

Harm and responsibility in hate speech
by: Simpson, R
Published: (2013)

Hate speech in public discourse
by: Lepoutre, M
Published: (2017)

Hobbes against hate speech
by: Bejan, T
Published: (2022)

Deep learning techniques for hate speech detection
by: Teng, Yen Fong
Published: (2024)

Deep learning techniques for hate speech detection
by: Lee, Yuan Cheng
Published: (2023)

Deep learning techniques for hate speech detection
by: Sam, Jared Mun Kit
Published: (2023)

Deep learning techniques for hate speech detection
by: Chang, Timothy Zu'En
Published: (2024)

Deep learning techniques for hate speech detection
by: Han, Angel Feng Yi
Published: (2023)

The right to political speech and the ban on hate speech
by: Szigeti, T
Published: (2017)

Legitimacy, hate speech, and viewpoint discrimination
by: Elford, GM
Published: (2020)

Hate speech on the rise: lacunae in Malaysian law
by: Wan Mohd Nor, Murni
Published: (2016)

Hate speech online: an (intractable) contemporary challenge?
by: O'Regan, C
Published: (2018)

Hate speech and the harm in Indonesian judicial decisions
by: Putri, D.K.
Published: (2023)

Socio-legal approaches to online hate speech
by: Stremlau, N, et al.
Published: (2019)

Hate speech laws: expressive power is not the answer
by: Lepoutre, M
Published: (2020)

An Interpretable Approach to Hateful Meme Detection
by: Deshpande, Tanvi, et al.
Published: (2022)

Contextualising hate speech: a study of India And Malaysia
by: Sharma, Ishita
Published: (2019)

State speech as a response to hate speech: Assessing ‘transformative liberalism’
by: Billingham, P
Published: (2019)

Hate speech and LGBT media framing effects among community
by: Mohd Zawawi, Julia Wirza, et al.
Published: (2021)

India’s Hate Speech Pandemic: Communal Intolerance and Sectarian Violence
by: Akanksha, Narain
Published: (2015)

Introduction: hate, offense and free speech in a changing world
by: Billingham, P, et al.
Published: (2010)

Syrian conflict fallout : time to contain hate speech in Indonesia
by: Navhat Nuraniyah
Published: (2014)

Bystanders’ collective responses set the norm against hate speech
by: Zapata, J, et al.
Published: (2024)

The responsible governance of hate speech in app stores: the case of Parler
by: Cowls, J
Published: (2023)

How to talk back: hate speech, misinformation, and the limits of salience
by: Fraser, R
Published: (2023)

Jean Tran hates eggs
by: Nguyen, Ha Thien Kim
Published: (2020)

Regulating hate speech on social media: should we or shouldn't we?
by: Wan Mohd Nor, Murni, et al.
Published: (2017)

An analysis of hateful contents detection techniques on social media
by: Maw, Maw
Published: (2016)

Hate mail sent to Irish Forum
by: Camden Chronicle, CC, et al.
Published: (1994)

Hate mob clearly can't think
by: Abd Razak, Dzulkifli
Published: (2007)

AIDS : accept the patient, hate the virus.
by: Kaur, Eknam., et al.
Published: (2008)

Countering the ideology of hate : the tripartite approach
by: Mohamed Nawab Mohamed Osman
Published: (2014)