Selective Feature Sets Based Fake News Detection for COVID-19 to Manage Infodemic

During the COVID-19 pandemic, the spread of fake news became easy due to the wide use of social media platforms. Considering the problematic consequences of fake news, efforts have been made for the timely detection of fake news using machine learning and deep learning models. Such works focus on mo...

Full description

Bibliographic Details
Main Authors: Manideep Narra, Muhammad Umer, Saima Sadiq, Ala' Abdulmajid Eshmawi, Hanen Karamti, Abdullah Mohamed, Imran Ashraf
Format: Article
Language:English
Published: IEEE 2022-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9893055/
Description
Summary:During the COVID-19 pandemic, the spread of fake news became easy due to the wide use of social media platforms. Considering the problematic consequences of fake news, efforts have been made for the timely detection of fake news using machine learning and deep learning models. Such works focus on model optimization and feature engineering and the extraction part is under-explored area. Therefore, the primary objective of this study is to investigate the impact of features to obtain high performance. For this purpose, this study analyzes the impact of different subset feature selection techniques on the performance of models for fake news detection. Principal component analysis and Chi-square are investigated for feature selection using machine learning and pre-trained deep learning models. Additionally, the influence of different preprocessing steps is also analyzed regarding fake news detection. Results obtained from comprehensive experiments reveal that the extra tree classifier outperforms with a 0.9474 accuracy when trained on the combination of term frequency-inverse document frequency and bag of words features. Models tend to yield poor results if no preprocessing or partial processing is carried out. Convolutional neural network, long short term memory network, residual neural network (ResNet), and InceptionV3 show marginally lower performance than the extra tree classifier. Results reveal that using subset features also helps to achieve robustness for machine learning models.
ISSN:2169-3536