Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News

With the increasing number of social media users in recent years, news in various fields, such as politics, economics, and so on, can be easily accessed by users. However, most news spread through social networks including Twitter, Facebook, and Instagram has unknown sources, thus having a significa...

Bibliographic Details
Main Authors: Jeong-Wook Lee, Jae-Hoon Kim
Format: Article
Language: English
Published: MDPI AG 2022-06-01
Series: Applied Sciences
Subjects: COVID-19; fake news; fake news detection; transfer learning; KoCharELECTRA
Online Access: https://www.mdpi.com/2076-3417/12/13/6402
_version_ 1797481048982945792
author Jeong-Wook Lee
Jae-Hoon Kim
collection DOAJ
description With the growing number of social media users in recent years, news in fields such as politics and economics has become easy to access. However, most news spread through social networks, including Twitter, Facebook, and Instagram, comes from unknown sources and therefore has a significant impact on news consumers. Fake news about COVID-19, which is affecting the global population, propagates quickly and causes social disorder. Much research is therefore being conducted on detecting COVID-19 fake news, but it faces a lack of datasets. To alleviate this problem, we built a dataset of COVID-19 fake news from fact-checking websites in Korea and propose a deep learning model for detecting COVID-19 fake news using this dataset. The proposed model is pre-trained on large-scale data and then performs transfer learning through a BiLSTM model. Moreover, we propose a method for initializing the hidden and cell states of the BiLSTM model with the [CLS] token representation instead of a zero vector. In experiments, the proposed model achieved an accuracy of 78.8%, an improvement of 8% over the linear baseline model, and showed that, as is already known, transfer learning can be useful with a small amount of data. Using a [CLS] token containing sentence information as the initial state of the BiLSTM can contribute to improving the model's performance.
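The initialization idea in the description can be illustrated with a small sketch. It is not the authors' implementation: the encoder outputs are random stand-ins for KoCharELECTRA vectors, the weights are untrained, and several choices are assumptions made only for illustration (the hidden size equals the encoder output size so the [CLS] vector can be used directly, both directions share the same initial state, and the [CLS] position is excluded from the BiLSTM input). All function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm_params(input_size, hidden_size, rng):
    """Random (untrained) weights for the input, forget, cell and output gates."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {gate: (matrix(hidden_size, input_size + hidden_size), [0.0] * hidden_size)
            for gate in "ifgo"}


def affine(weight_bias, vec):
    weight, bias = weight_bias
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def lstm_step(params, x, h, c):
    """One standard LSTM step on input x with previous states h and c."""
    z = x + h  # concatenate the input with the previous hidden state
    i = [sigmoid(v) for v in affine(params["i"], z)]
    f = [sigmoid(v) for v in affine(params["f"], z)]
    g = [math.tanh(v) for v in affine(params["g"], z)]
    o = [sigmoid(v) for v in affine(params["o"], z)]
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def run_direction(params, tokens, h0, c0):
    """Run one LSTM direction from the given initial states; return the final hidden state."""
    h, c = list(h0), list(c0)
    for x in tokens:
        h, c = lstm_step(params, x, h, c)
    return h


def bilstm_sentence_vector(fwd, bwd, tokens, init):
    """Concatenate the final forward and backward hidden states.

    `init` is used as both the hidden and the cell state of each direction.
    """
    h_fwd = run_direction(fwd, tokens, init, init)
    h_bwd = run_direction(bwd, tokens[::-1], init, init)
    return h_fwd + h_bwd


rng = random.Random(0)
dim = 4  # assumed: encoder output size == LSTM hidden size
# Stand-in for pre-trained encoder output: position 0 plays the role of [CLS].
encoder_out = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(6)]
cls_vec, token_vecs = encoder_out[0], encoder_out[1:]

fwd = make_lstm_params(dim, dim, rng)
bwd = make_lstm_params(dim, dim, rng)

zero_init = bilstm_sentence_vector(fwd, bwd, token_vecs, [0.0] * dim)  # conventional start
cls_init = bilstm_sentence_vector(fwd, bwd, token_vecs, cls_vec)       # [CLS]-based start
```

In a full model, the resulting sentence vector would be passed to a classification layer that predicts fake or real, and the recurrent weights would be trained on the labeled dataset; the sketch only shows where the [CLS] vector replaces the zero vector.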
first_indexed 2024-03-09T22:08:54Z
format Article
id doaj.art-e690b3bf4d8243b8aba6b220fab81380
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T22:08:54Z
publishDate 2022-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-e690b3bf4d8243b8aba6b220fab81380 2023-11-23T19:36:00Z eng
MDPI AG
Applied Sciences 2076-3417
2022-06-01, vol. 12, no. 13, art. 6402
10.3390/app12136402
Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News
Jeong-Wook Lee (0), Department of Computer Engineering and Interdisciplinary Major of Maritime AI Convergence, Korea Maritime & Ocean University, Busan 49112, Korea
Jae-Hoon Kim (1), Department of Computer Engineering and Interdisciplinary Major of Maritime AI Convergence, Korea Maritime & Ocean University, Busan 49112, Korea
https://www.mdpi.com/2076-3417/12/13/6402
COVID-19
fake news
fake news detection
transfer learning
KoCharELECTRA
title Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News
title_sort fake sentence detection based on transfer learning applying to korean covid 19 fake news
topic COVID-19
fake news
fake news detection
transfer learning
KoCharELECTRA
url https://www.mdpi.com/2076-3417/12/13/6402