Matching tweets with applicable fact-checks across languages

An important challenge for news fact-checking is the effective dissemination of existing fact-checks. This in turn brings the need for reliable methods to detect previously fact-checked claims. In this paper, we focus on automatically finding existing fact-checks for claims made in social media post...


Bibliographic Details
Main Authors: Kazemi, A; Li, Z; Pérez-Rosas, V; Hale, SA; Mihalcea, R
Format: Conference item
Language: English
Published: CEUR Workshop Proceedings 2022
_version_ 1797109638031736832
author Kazemi, A
Li, Z
Pérez-Rosas, V
Hale, SA
Mihalcea, R
author_facet Kazemi, A
Li, Z
Pérez-Rosas, V
Hale, SA
Mihalcea, R
author_sort Kazemi, A
collection OXFORD
description An important challenge for news fact-checking is the effective dissemination of existing fact-checks. This in turn brings the need for reliable methods to detect previously fact-checked claims. In this paper, we focus on automatically finding existing fact-checks for claims made in social media posts (tweets). We conduct both classification and retrieval experiments, in monolingual (English only), multilingual (Spanish, Portuguese), and cross-lingual (Hindi-English) settings using multilingual transformer models such as XLM-RoBERTa and multilingual embeddings such as LaBSE and SBERT. We present promising results for “match” classification (86% average accuracy) in four language pairs. We also find that a BM25 baseline outperforms or is on par with state-of-the-art multilingual embedding models for the retrieval task during our monolingual experiments. We highlight and discuss NLP challenges while addressing this problem in different languages, and we introduce a novel curated dataset of fact-checks and corresponding tweets for future research.
first_indexed 2024-03-07T07:42:52Z
format Conference item
id oxford-uuid:b17db0f6-169b-4f09-94ce-6913c4cf3cf9
institution University of Oxford
language English
last_indexed 2024-03-07T07:42:52Z
publishDate 2022
publisher CEUR Workshop Proceedings
record_format dspace
spelling oxford-uuid:b17db0f6-169b-4f09-94ce-6913c4cf3cf9
2023-05-12T10:34:10Z
Matching tweets with applicable fact-checks across languages
Conference item
http://purl.org/coar/resource_type/c_5794
uuid:b17db0f6-169b-4f09-94ce-6913c4cf3cf9
English
Symplectic Elements
CEUR Workshop Proceedings
2022
Kazemi, A
Li, Z
Pérez-Rosas, V
Hale, SA
Mihalcea, R
An important challenge for news fact-checking is the effective dissemination of existing fact-checks. This in turn brings the need for reliable methods to detect previously fact-checked claims. In this paper, we focus on automatically finding existing fact-checks for claims made in social media posts (tweets). We conduct both classification and retrieval experiments, in monolingual (English only), multilingual (Spanish, Portuguese), and cross-lingual (Hindi-English) settings using multilingual transformer models such as XLM-RoBERTa and multilingual embeddings such as LaBSE and SBERT. We present promising results for “match” classification (86% average accuracy) in four language pairs. We also find that a BM25 baseline outperforms or is on par with state-of-the-art multilingual embedding models for the retrieval task during our monolingual experiments. We highlight and discuss NLP challenges while addressing this problem in different languages, and we introduce a novel curated dataset of fact-checks and corresponding tweets for future research.
spellingShingle Kazemi, A
Li, Z
Pérez-Rosas, V
Hale, SA
Mihalcea, R
Matching tweets with applicable fact-checks across languages
title Matching tweets with applicable fact-checks across languages
title_full Matching tweets with applicable fact-checks across languages
title_fullStr Matching tweets with applicable fact-checks across languages
title_full_unstemmed Matching tweets with applicable fact-checks across languages
title_short Matching tweets with applicable fact-checks across languages
title_sort matching tweets with applicable fact checks across languages
work_keys_str_mv AT kazemia matchingtweetswithapplicablefactchecksacrosslanguages
AT liz matchingtweetswithapplicablefactchecksacrosslanguages
AT perezrosasv matchingtweetswithapplicablefactchecksacrosslanguages
AT halesa matchingtweetswithapplicablefactchecksacrosslanguages
AT mihalcear matchingtweetswithapplicablefactchecksacrosslanguages