Fake news detection: a systematic literature review of machine learning algorithms and datasets

Fake news (i.e., false news created to have a high capacity for dissemination and malicious intentions) is a problem of great interest to society today since it has achieved unprecedented political, economic, and social impacts. Taking advantage of modern digital communication and information techno...

Bibliographic Details
Main Authors: Humberto Fernandes Villela, Fábio Corrêa, Jurema Suely de Araújo Nery Ribeiro, Air Rabelo, Dárlinton Barbosa Feres Carvalho
Format: Article
Language: English
Published: Brazilian Computer Society 2023-03-01
Series: Journal on Interactive Systems
Subjects: Algorithms, datasets, accuracy, fake news, artificial intelligence
Online Access: https://sol.sbc.org.br/journals/index.php/jis/article/view/3020
author Humberto Fernandes Villela
Fábio Corrêa
Jurema Suely de Araújo Nery Ribeiro
Air Rabelo
Dárlinton Barbosa Feres Carvalho
collection DOAJ
description Fake news (i.e., false news created with malicious intent and a high capacity for dissemination) is a problem of great interest to society today because it has had unprecedented political, economic, and social impacts. Taking advantage of modern digital communication and information technologies, fake news is widely propagated through social media, where its use is intentional and it is challenging to identify. To mitigate the damage it causes, researchers have sought to develop automated detection mechanisms, such as machine learning algorithms, together with the datasets used to develop them. This research analyzes the machine learning algorithms published in the literature for identifying fake news and the datasets used to train them. It is exploratory research with a qualitative approach that uses a research protocol to identify the studies to be analyzed. The results highlight the Stacking Method, Bidirectional Recurrent Neural Network (BiRNN), and Convolutional Neural Network (CNN) algorithms, with reported accuracies of 99.9%, 99.8%, and 99.8%, respectively. Although these accuracies are high, most of the studies employed datasets from controlled environments (e.g., Kaggle) or datasets without information updated in real time (from social networks). Only a few studies have been applied in social network environments, where the most significant dissemination of disinformation occurs today. Kaggle was the platform with the most frequently used datasets, followed by Weibo, FNC-1, COVID-19 Fake News, and Twitter. Future research should extend beyond political news, the area that primarily motivated the growth of research from 2017 onward, and should explore hybrid methods for identifying fake news.
first_indexed 2024-04-10T00:22:34Z
format Article
id doaj.art-d4276c0dbfc94a398e9c973e4166e82c
institution Directory Open Access Journal
issn 2763-7719
language English
last_indexed 2024-04-10T00:22:34Z
publishDate 2023-03-01
publisher Brazilian Computer Society
record_format Article
series Journal on Interactive Systems
spelling doaj.art-d4276c0dbfc94a398e9c973e4166e82c
2023-03-15T16:57:14Z
eng
Brazilian Computer Society
Journal on Interactive Systems
2763-7719
2023-03-01
Volume 14, Issue 1
10.5753/jis.2023.3020
Fake news detection: a systematic literature review of machine learning algorithms and datasets
Humberto Fernandes Villela (Universidade FUMEC)
Fábio Corrêa (Universidade FUMEC)
Jurema Suely de Araújo Nery Ribeiro (Universidade FUMEC)
Air Rabelo (Universidade FUMEC)
Dárlinton Barbosa Feres Carvalho (Universidade Federal de São João del-Rei)
https://sol.sbc.org.br/journals/index.php/jis/article/view/3020
Algorithms
datasets
accuracy
fake news
artificial intelligence
title Fake news detection: a systematic literature review of machine learning algorithms and datasets
topic Algorithms
datasets
accuracy
fake news
artificial intelligence
url https://sol.sbc.org.br/journals/index.php/jis/article/view/3020