Fake news detection: a systematic literature review of machine learning algorithms and datasets
Fake news (i.e., false news created with malicious intent and a high capacity for dissemination) is a problem of great interest to society today because it has had unprecedented political, economic, and social impacts. Taking advantage of modern digital communication and information techno...
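Main Authors: | Humberto Fernandes Villela, Fábio Corrêa, Jurema Suely de Araújo Nery Ribeiro, Air Rabelo, Dárlinton Barbosa Feres Carvalho |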
---|---|
Format: | Article |
Language: | English |
Published: | Brazilian Computer Society, 2023-03-01 |
Series: | Journal on Interactive Systems |
Subjects: | Algorithms, datasets, accuracy, fake news, artificial intelligence |
Online Access: | https://sol.sbc.org.br/journals/index.php/jis/article/view/3020 |
_version_ | 1797870135980064768 |
---|---|
author | Humberto Fernandes Villela, Fábio Corrêa, Jurema Suely de Araújo Nery Ribeiro, Air Rabelo, Dárlinton Barbosa Feres Carvalho |
author_sort | Humberto Fernandes Villela |
collection | DOAJ |
description | Fake news (i.e., false news created with malicious intent and a high capacity for dissemination) is a problem of great interest to society today because it has had unprecedented political, economic, and social impacts. Taking advantage of modern digital communication and information technologies, fake news spreads widely through social media, where its use is intentional and difficult to identify. To mitigate the damage it causes, researchers have sought to develop automated detection mechanisms, such as machine learning algorithms, along with the datasets used to build them. This research analyzes the machine learning algorithms and training datasets for fake news identification that have been published in the literature. It is exploratory research with a qualitative approach, which uses a research protocol to identify the studies to be analyzed. The results highlight the Stacking Method, Bidirectional Recurrent Neural Network (BiRNN), and Convolutional Neural Network (CNN) algorithms, with 99.9%, 99.8%, and 99.8% accuracy, respectively. Although these accuracies are high, most of the studies used datasets from controlled environments (e.g., Kaggle) or without information updated in real time (from social networks). Only a few studies were applied in social network environments, where the most significant dissemination of disinformation occurs today. Kaggle was the platform whose datasets were used most frequently, followed by Weibo, FNC-1, COVID-19 Fake News, and Twitter. Future research should cover topics beyond political news, the area that primarily drove the growth of research from 2017 onward, and should explore hybrid methods for identifying fake news.
|
first_indexed | 2024-04-10T00:22:34Z |
format | Article |
id | doaj.art-d4276c0dbfc94a398e9c973e4166e82c |
institution | Directory Open Access Journal |
issn | 2763-7719 |
language | English |
last_indexed | 2024-04-10T00:22:34Z |
publishDate | 2023-03-01 |
publisher | Brazilian Computer Society |
record_format | Article |
series | Journal on Interactive Systems |
spelling | doaj.art-d4276c0dbfc94a398e9c973e4166e82c; 2023-03-15T16:57:14Z; eng; Brazilian Computer Society; Journal on Interactive Systems; 2763-7719; 2023-03-01; 14; 1; 10.5753/jis.2023.3020; Fake news detection: a systematic literature review of machine learning algorithms and datasets; Humberto Fernandes Villela (0), Fábio Corrêa (1), Jurema Suely de Araújo Nery Ribeiro (2), Air Rabelo (3), Dárlinton Barbosa Feres Carvalho (4); Universidade FUMEC, Universidade FUMEC, Universidade FUMEC, Universidade FUMEC, Universidade Federal de São João del-Rei; https://sol.sbc.org.br/journals/index.php/jis/article/view/3020; Algorithms, datasets, accuracy, fake news, artificial intelligence |
title | Fake news detection: a systematic literature review of machine learning algorithms and datasets |
topic | Algorithms, datasets, accuracy, fake news, artificial intelligence |
url | https://sol.sbc.org.br/journals/index.php/jis/article/view/3020 |