Matching tweets with applicable fact-checks across languages
An important challenge for news fact-checking is the effective dissemination of existing fact-checks. This in turn brings the need for reliable methods to detect previously fact-checked claims. In this paper, we focus on automatically finding existing fact-checks for claims made in social media post...
Main Authors: | Kazemi, A; Li, Z; Pérez-Rosas, V; Hale, SA; Mihalcea, R |
---|---|
Format: | Conference item |
Language: | English |
Published: | CEUR Workshop Proceedings, 2022 |
_version_ | 1797109638031736832 |
---|---|
author | Kazemi, A; Li, Z; Pérez-Rosas, V; Hale, SA; Mihalcea, R |
author_sort | Kazemi, A |
collection | OXFORD |
description | An important challenge for news fact-checking is the effective dissemination of existing fact-checks. This in turn brings the need for reliable methods to detect previously fact-checked claims. In this paper, we focus on automatically finding existing fact-checks for claims made in social media posts (tweets). We conduct both classification and retrieval experiments, in monolingual (English only), multilingual (Spanish, Portuguese), and cross-lingual (Hindi-English) settings using multilingual transformer models such as XLM-RoBERTa and multilingual embeddings such as LaBSE and SBERT. We present promising results for “match” classification (86% average accuracy) in four language pairs. We also find that a BM25 baseline outperforms or is on par with state-of-the-art multilingual embedding models for the retrieval task during our monolingual experiments. We highlight and discuss NLP challenges while addressing this problem in different languages, and we introduce a novel curated dataset of fact-checks and corresponding tweets for future research. |
first_indexed | 2024-03-07T07:42:52Z |
format | Conference item |
id | oxford-uuid:b17db0f6-169b-4f09-94ce-6913c4cf3cf9 |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T07:42:52Z |
publishDate | 2022 |
publisher | CEUR Workshop Proceedings |
record_format | dspace |
title | Matching tweets with applicable fact-checks across languages |