Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News
With the increasing number of social media users in recent years, news in various fields, such as politics, economics, and so on, can be easily accessed by users. However, most news spread through social networks including Twitter, Facebook, and Instagram has unknown sources, thus having a significa...
Main Authors: | Jeong-Wook Lee, Jae-Hoon Kim |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-06-01 |
Series: | Applied Sciences |
Subjects: | COVID-19; fake news; fake news detection; transfer learning; KoCharELECTRA |
Online Access: | https://www.mdpi.com/2076-3417/12/13/6402 |
_version_ | 1797481048982945792 |
---|---|
author | Jeong-Wook Lee; Jae-Hoon Kim |
author_facet | Jeong-Wook Lee; Jae-Hoon Kim |
author_sort | Jeong-Wook Lee |
collection | DOAJ |
description | With the growing number of social media users in recent years, news in fields such as politics and economics has become easy to access. However, most news spread through social networks such as Twitter, Facebook, and Instagram comes from unknown sources and can therefore have a significant impact on news consumers. Fake news about COVID-19, which is affecting the global population, propagates quickly and causes social disorder. Much research is therefore being conducted on detecting COVID-19 fake news, but it is hampered by a lack of datasets. To alleviate this problem, we built a dataset of COVID-19 fake news from fact-checking websites in Korea and propose a deep learning model for detecting COVID-19 fake news using this dataset. The proposed model is pre-trained on large-scale data and is then adapted by transfer learning through a BiLSTM model. Moreover, we propose a method for initializing the hidden and cell states of the BiLSTM with the [CLS] token representation instead of a zero vector. In experiments, the proposed model achieved an accuracy of 78.8%, an improvement of 8% over a linear baseline model, and the results indicate that transfer learning can be useful with a small amount of data. Using a [CLS] token that carries sentence information as the initial state of the BiLSTM can contribute to improved model performance. |
first_indexed | 2024-03-09T22:08:54Z |
format | Article |
id | doaj.art-e690b3bf4d8243b8aba6b220fab81380 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T22:08:54Z |
publishDate | 2022-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-e690b3bf4d8243b8aba6b220fab81380; 2023-11-23T19:36:00Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-06-01; 12; 13; 6402; 10.3390/app12136402; Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News; Jeong-Wook Lee [0]; Jae-Hoon Kim [1]; [0] Department of Computer Engineering and Interdisciplinary Major of Maritime AI Convergence, Korea Maritime & Ocean University, Busan 49112, Korea; [1] Department of Computer Engineering and Interdisciplinary Major of Maritime AI Convergence, Korea Maritime & Ocean University, Busan 49112, Korea; With the growing number of social media users in recent years, news in fields such as politics and economics has become easy to access. However, most news spread through social networks such as Twitter, Facebook, and Instagram comes from unknown sources and can therefore have a significant impact on news consumers. Fake news about COVID-19, which is affecting the global population, propagates quickly and causes social disorder. Much research is therefore being conducted on detecting COVID-19 fake news, but it is hampered by a lack of datasets. To alleviate this problem, we built a dataset of COVID-19 fake news from fact-checking websites in Korea and propose a deep learning model for detecting COVID-19 fake news using this dataset. The proposed model is pre-trained on large-scale data and is then adapted by transfer learning through a BiLSTM model. Moreover, we propose a method for initializing the hidden and cell states of the BiLSTM with the [CLS] token representation instead of a zero vector. In experiments, the proposed model achieved an accuracy of 78.8%, an improvement of 8% over a linear baseline model, and the results indicate that transfer learning can be useful with a small amount of data. Using a [CLS] token that carries sentence information as the initial state of the BiLSTM can contribute to improved model performance.; https://www.mdpi.com/2076-3417/12/13/6402; COVID-19; fake news; fake news detection; transfer learning; KoCharELECTRA |
spellingShingle | Jeong-Wook Lee; Jae-Hoon Kim; Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News; Applied Sciences; COVID-19; fake news; fake news detection; transfer learning; KoCharELECTRA |
title | Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News |
title_full | Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News |
title_fullStr | Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News |
title_full_unstemmed | Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News |
title_short | Fake Sentence Detection Based on Transfer Learning: Applying to Korean COVID-19 Fake News |
title_sort | fake sentence detection based on transfer learning applying to korean covid 19 fake news |
topic | COVID-19; fake news; fake news detection; transfer learning; KoCharELECTRA |
url | https://www.mdpi.com/2076-3417/12/13/6402 |
work_keys_str_mv | AT jeongwooklee fakesentencedetectionbasedontransferlearningapplyingtokoreancovid19fakenews AT jaehoonkim fakesentencedetectionbasedontransferlearningapplyingtokoreancovid19fakenews |
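
The description field states that the hidden and cell states of the BiLSTM are initialized with the [CLS] token rather than a zero vector. The following is a minimal, standard-library Python sketch of that idea only, not the authors' implementation. It assumes that the [CLS] vector has the same width as the LSTM hidden state and is copied unchanged into both the hidden and cell states of both directions; the record does not say whether the paper projects or splits the vector. Random vectors stand in for the outputs of the pre-trained encoder (the keywords mention KoCharELECTRA), and all class and function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class LSTMCell:
    """Hypothetical single-layer LSTM cell with small random weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One stacked matrix for the input, forget, candidate, and output gates.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
                  for _ in range(4 * hidden_size)]
        self.b = [0.0] * (4 * hidden_size)

    def step(self, x, h, c):
        z = [a + b for a, b in zip(matvec(self.W, x + h), self.b)]
        n = self.hidden_size
        i = [sigmoid(v) for v in z[:n]]
        f = [sigmoid(v) for v in z[n:2 * n]]
        g = [math.tanh(v) for v in z[2 * n:3 * n]]
        o = [sigmoid(v) for v in z[3 * n:]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


def run_direction(cell, tokens, h0, c0):
    """Run one LSTM direction over the tokens and return the final hidden state."""
    h, c = list(h0), list(c0)
    for x in tokens:
        h, c = cell.step(x, h, c)
    return h


def bilstm_encode(tokens, cls_vec, fwd, bwd, use_cls_init=True):
    """Concatenate the final forward and backward hidden states.

    Assumption: the [CLS] vector is used as-is for both h0 and c0.
    """
    init = list(cls_vec) if use_cls_init else [0.0] * fwd.hidden_size
    h_fwd = run_direction(fwd, tokens, init, init)
    h_bwd = run_direction(bwd, tokens[::-1], init, init)
    return h_fwd + h_bwd


rng = random.Random(0)
hidden = 4
# Stand-ins for encoder outputs: five token vectors and one [CLS] vector.
tokens = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(5)]
cls_vec = [rng.uniform(-1, 1) for _ in range(hidden)]
fwd = LSTMCell(hidden, hidden, rng)
bwd = LSTMCell(hidden, hidden, rng)

with_cls = bilstm_encode(tokens, cls_vec, fwd, bwd, use_cls_init=True)
with_zero = bilstm_encode(tokens, cls_vec, fwd, bwd, use_cls_init=False)
```

With the same weights and inputs, `with_cls` and `with_zero` differ only because of the initial state, which is the effect the abstract attributes to the [CLS] initialization. In a real model the concatenated vector would feed a classification layer, and the weights would be trained rather than random.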