Deep vs. Shallow: A Comparative Study of Machine Learning and Deep Learning Approaches for Fake Health News Detection


Bibliographic Details
Main Authors: Tripti Mahara, V. L. Helen Josephine, Rashmi Srinivasan, Poorvi Prakash, Abeer D. Algarni, Om Prakash Verma
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10192389/
Description
Summary: The explosive growth and penetration of the Internet have amplified the fake news problem, which existed even before the Internet became widespread. The problem is of greater concern when the news is health-related. To address this issue, this research proposes Content Based Models (CBM) and Feature Based Models (FBM). The two model types differ in the input they receive: a CBM takes only the news content as input, whereas an FBM takes two readability features in addition to the content. Under each category, the performance of five traditional machine learning techniques (Decision Tree, Random Forest, Support Vector Machine, AdaBoost-Decision Tree, and AdaBoost-Random Forest) is compared with that of two hybrid deep learning approaches, namely CNN-LSTM and CNN-BiLSTM. The study uses the Fake News Healthcare dataset, which comprises 9581 articles. The Easy Data Augmentation technique is used to balance this highly imbalanced dataset. The experimental results demonstrate that Feature Based Models perform better than Content Based Models. Among the proposed FBMs, the hybrid CNN-LSTM model achieved an F1 score of 97.09% and AdaBoost-Random Forest achieved an F1 score of 98.9%. Thus, AdaBoost-Random Forest under FBM is the best-performing model for the classification of fake news.
ISSN: 2169-3536
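The summary states that Easy Data Augmentation (EDA) was used to balance the dataset but gives no implementation details. The following is a minimal sketch of that idea, not the authors' code. It uses only two of the four EDA operations (random swap and random deletion), because the other two (synonym replacement and random insertion) need a thesaurus. The function names, the `alpha` rate, the binary-label assumption, and the choice of which class is the minority are all hypothetical and are not taken from the record.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen word positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(text, rng, alpha=0.1):
    """Return one augmented copy of text (alpha is an assumed rate)."""
    words = text.split()
    if not words:
        return text
    n = max(1, int(alpha * len(words)))
    if rng.random() < 0.5:
        new_words = random_swap(words, n, rng)
    else:
        new_words = random_deletion(words, alpha, rng)
    return " ".join(new_words)


def balance(texts, labels, minority_label, seed=0):
    """Append augmented minority texts until both classes are equal in size.

    Assumes a binary task: every label other than minority_label is
    counted as the majority class.
    """
    rng = random.Random(seed)
    minority = [t for t, y in zip(texts, labels) if y == minority_label]
    n_majority = sum(1 for y in labels if y != minority_label)
    out_texts, out_labels = list(texts), list(labels)
    if not minority:
        return out_texts, out_labels
    needed = max(0, n_majority - len(minority))
    for k in range(needed):
        source = minority[k % len(minority)]  # cycle through minority texts
        out_texts.append(augment(source, rng))
        out_labels.append(minority_label)
    return out_texts, out_labels
```

Calling `balance(texts, labels, minority_label=1)` returns the original articles followed by enough augmented minority-class copies to equalize the two class counts. In practice, augmentation of this kind is usually applied to the training split only, so that evaluation data stays untouched.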