Summary: | Fake news poses a grave threat with devastating consequences in this information-centric age. While advances in data science hold the key to accurately detecting and curtailing the spread of fake news, guidance on selecting the algorithms and models best suited to a specific fake news scenario leaves much to be desired. Most studies have focused on fake news in a single domain and employed a limited range of algorithmic techniques. However, the thematic diversity of fake news raises questions about the comprehensiveness of such techniques, whose performance drops when they are exposed to fake news from a different domain. The current study responds to this need for guidance by focusing on thematically diverse datasets, applying a series of complex algorithms to them, and performing topic modeling on them. The results demonstrate that ensemble techniques outperform other algorithms, achieving accuracy of over 98 percent on thematically diverse datasets and over 95 percent on pandemic-related datasets. The study also demonstrates that neural networks are not a panacea for all situations, while topic modeling helps illustrate the lack of coherence in fake news articles. The study offers a distinct perspective on the accuracy of a diverse set of algorithmic approaches and their ability to adapt to an ever-evolving, multi-domain world of fake news. A key implication of the study is a comprehensive view of how classification performance on diverse datasets, which include pandemic-related news and data from other disciplines, compares with performance on pandemic-related data alone. Our practical contribution is the comparative perspective we offer to practitioners who must choose an algorithm to accurately detect thematically heterogeneous fake news.
|