Multilingual hope speech detection: A Robust framework using transfer learning of fine-tuning RoBERTa model

Hope Speech Detection (HSD) from social media is a new direction for promoting and supporting positive content to encourage harmony and positivity in society. Although users of social media belong to different linguistic communities, hope speech detection is rarely studied as a multilingual task consideri...


Bibliographic Details
Main Authors:	Muhammad Shahid Iqbal Malik, Anna Nazarova, Mona Mamdouh Jamjoom, Dmitry I. Ignatov
Format:	Article
Language:	English
Published:	Elsevier 2023-09-01
Series:	Journal of King Saud University: Computer and Information Sciences
Subjects:	Transfer learning; Russian; XLM-RoBERTa; Hope speech; Translation-based; Multi-lingual
Online Access:	http://www.sciencedirect.com/science/article/pii/S1319157823002902

_version_	1797663862062841856
author	Muhammad Shahid Iqbal Malik, Anna Nazarova, Mona Mamdouh Jamjoom, Dmitry I. Ignatov
author_sort	Muhammad Shahid Iqbal Malik
collection	DOAJ
description	Hope Speech Detection (HSD) on social media is a new research direction that promotes and supports positive content to encourage harmony and positivity in society. Although social media users belong to different linguistic communities, hope speech detection is rarely studied as a multilingual task that considers low-resource languages. Moreover, prior studies explored only monolingual techniques, and the Russian language has not been addressed. This study tackles Multi-lingual Hope Speech Detection (MHSD) in English and Russian using the transfer learning paradigm with a fine-tuning approach. We explore joint multi-lingual and translation-based approaches to handle multilingualism: the translation-based approach translates all content into one language and then classifies it, whereas the joint multi-lingual approach designs a single universal classifier for the various languages. We exploit the strengths of the Robustly Optimized BERT Pre-Training Approach (RoBERTa), which has shown benchmark performance in capturing the semantics and contextual information of content. The proposed framework consists of four stages: 1) data preprocessing, 2) representation of data using RoBERTa models, 3) fine-tuning, and 4) classification of hope speech into two labels. A new Russian corpus for hope speech detection is built from YouTube comments. Several experiments are conducted in English and Russian using semi-supervised bilingual English and Russian datasets. The findings show that the proposed framework achieves benchmark performance and outperforms the baselines. Furthermore, the translation-based approach (Russian-RoBERTa) offers the best performance, achieving 94% accuracy and an 80.24% F1-score.
first_indexed	2024-03-11T19:20:55Z
format	Article
id	doaj.art-487cb3a4f9764839903e257f33e93980
institution	Directory of Open Access Journals
issn	1319-1578
language	English
last_indexed	2024-03-11T19:20:55Z
publishDate	2023-09-01
publisher	Elsevier
record_format	Article
series	Journal of King Saud University: Computer and Information Sciences
spelling	doaj.art-487cb3a4f9764839903e257f33e93980 | 2023-10-07T04:34:12Z | eng | Elsevier | Journal of King Saud University: Computer and Information Sciences | 1319-1578 | 2023-09-01 | vol. 35, no. 8, art. 101736 | Multilingual hope speech detection: A Robust framework using transfer learning of fine-tuning RoBERTa model | Muhammad Shahid Iqbal Malik (Department of Computer Science, National Research University Higher School of Economics, 11 Pokrovskiy Boulevard, Moscow 109028, Russian Federation; corresponding author) | Anna Nazarova (Department of Computer Science, National Research University Higher School of Economics, 11 Pokrovskiy Boulevard, Moscow 109028, Russian Federation) | Mona Mamdouh Jamjoom (Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia) | Dmitry I. Ignatov (Department of Computer Science, National Research University Higher School of Economics, 11 Pokrovskiy Boulevard, Moscow 109028, Russian Federation) | http://www.sciencedirect.com/science/article/pii/S1319157823002902 | Transfer learning; Russian; XLM-RoBERTa; Hope speech; Translation-based; Multi-lingual
title	Multilingual hope speech detection: A Robust framework using transfer learning of fine-tuning RoBERTa model
topic	Transfer learning; Russian; XLM-RoBERTa; Hope speech; Translation-based; Multi-lingual
url	http://www.sciencedirect.com/science/article/pii/S1319157823002902
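
Illustrative note on the two approaches named in the description: the translation-based approach first maps every comment into one language and then applies a classifier for that language, while the joint multi-lingual approach sends comments in all languages to a single classifier. The Python sketch below shows only that difference in control flow. It is not the authors' implementation. Every name in it (classify_translation_based, classify_joint_multilingual, and the toy_* helpers) is hypothetical, and the keyword-based toy classifiers and one-word dictionary merely stand in for fine-tuned RoBERTa models and a real translation system.

```python
from typing import Callable, List, Tuple

Comment = Tuple[str, str]  # (language code, text)


def classify_translation_based(
    comments: List[Comment],
    pivot_lang: str,
    translate: Callable[[str, str, str], str],
    monolingual_classifier: Callable[[str], int],
) -> List[int]:
    """Translate every comment into one pivot language, then classify it."""
    labels = []
    for lang, text in comments:
        if lang != pivot_lang:
            text = translate(text, lang, pivot_lang)
        labels.append(monolingual_classifier(text))
    return labels


def classify_joint_multilingual(
    comments: List[Comment],
    multilingual_classifier: Callable[[str], int],
) -> List[int]:
    """Send all languages to one shared classifier, with no translation step."""
    return [multilingual_classifier(text) for _, text in comments]


# --- Toy stand-ins (assumptions for illustration, not real models) ---
TOY_DICTIONARY = {"hope": "надежда"}


def toy_translate(text: str, src: str, dst: str) -> str:
    # Word-by-word lookup; src and dst are ignored by this toy version,
    # which only knows the single en -> ru entry above.
    return " ".join(TOY_DICTIONARY.get(w, w) for w in text.lower().split())


def toy_russian_classifier(text: str) -> int:
    # 1 = hope speech, 0 = not hope speech (keyword rule, not a model).
    return 1 if "надежда" in text.lower() else 0


def toy_multilingual_classifier(text: str) -> int:
    lowered = text.lower()
    return 1 if ("hope" in lowered or "надежда" in lowered) else 0


comments = [
    ("en", "There is hope"),
    ("ru", "Надежда есть"),
    ("en", "Nothing new"),
]

translation_labels = classify_translation_based(
    comments, "ru", toy_translate, toy_russian_classifier
)
joint_labels = classify_joint_multilingual(comments, toy_multilingual_classifier)

print(translation_labels)
print(joint_labels)
```

In a real system the toy callables would be replaced by a machine translation service and by fine-tuned sequence classifiers; the two wrapper functions would stay the same, which is why the classifiers are passed in as parameters.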
