Construction of multi-modal social media dataset for fake news detection

The advent of social media has brought about significant changes in people’s lives. While social media allows for easy access and sharing of news, it has also become a breeding ground for the dissemination of fake news, posing a serious threat to social security and stability. Consequently, researcher...


Bibliographic Details
Main Authors: Guopeng GAO, Yaodong FANG, Yanfang HAN, Zhenxing QIAN, Chuan QIN
Format: Article
Language: English
Published: POSTS&TELECOM PRESS Co., LTD 2023-08-01
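Series: 网络与信息安全学报 (Chinese Journal of Network and Information Security)
Subjects: social media; fake news detection; multi-modal; multi-category; dataset
Online Access: https://www.infocomm-journal.com/cjnis/CN/10.11959/j.issn.2096-109x.2023060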
author Guopeng GAO, Yaodong FANG, Yanfang HAN, Zhenxing QIAN, Chuan QIN
collection DOAJ
description The advent of social media has brought about significant changes in people’s lives. While social media allows for easy access to and sharing of news, it has also become a breeding ground for the dissemination of fake news, posing a serious threat to social security and stability. Consequently, researchers have shifted their focus towards fake news detection. Although several deep learning-based solutions have been proposed, these methods rely heavily on large amounts of supporting data. Currently, existing datasets are scarce, particularly in Chinese, and the collected news articles are often limited to a single category. To enhance the detection of fake news, a new multi-modal fake news dataset (MFND) was developed, comprising Chinese and English news data from ten diverse categories: politics, economy, entertainment, sports, international affairs, technology, military, education, health, and social life. The word frequencies and categories of the proposed dataset were analyzed, and the dataset was compared with existing fake news datasets in terms of number of news items, news categories, modal information, and news languages. The comparison shows that MFND excels in category information and news languages. Moreover, when existing typical fake news detection methods were trained and validated on MFND, the experimental results showed an improvement of approximately 10% in model performance compared with existing mainstream fake news datasets.
first_indexed 2024-04-24T23:56:47Z
format Article
id doaj.art-05af67358f6e4dec947dc9346cbbbdfa
institution Directory Open Access Journal
issn 2096-109X
language English
last_indexed 2024-04-24T23:56:47Z
publishDate 2023-08-01
publisher POSTS&TELECOM PRESS Co., LTD
record_format Article
series 网络与信息安全学报 (Chinese Journal of Network and Information Security)
spelling doaj.art-05af67358f6e4dec947dc9346cbbbdfa 2024-03-14T11:38:04Z eng POSTS&TELECOM PRESS Co., LTD 网络与信息安全学报 2096-109X 2023-08-01 9 4 144 154 10.11959/j.issn.2096-109x.2023060
title Construction of multi-modal social media dataset for fake news detection
topic social media
fake news detection
multi-modal
multi-category
dataset
url https://www.infocomm-journal.com/cjnis/CN/10.11959/j.issn.2096-109x.2023060