Rating news claims: Feature selection and evaluation

News claims that travel the Internet and online social networks (OSNs) originate from different, sometimes unknown sources, which raises issues related to the credibility of those claims and the drivers behind them. Fact-checking websites such as Snopes, FactCheck, and Emergent use human evaluators to investigate and label news claims, but the process is labor- and time-intensive. Driven by the need to use data analytics and algorithms in assessing the credibility of news claims, we focus on what can be generalized about evaluating human-labeled claims. We developed tools to extract claims from Snopes and Emergent and used public datasets collected by and published on those websites. Claims extracted from those datasets were supervised, or labeled, with different claim ratings. We focus on claims with definite ratings (false, mostly false, true, and mostly true), with the goal of identifying distinctive features that can be used to distinguish true from false claims. Ultimately, those features can be used to predict labels for future unsupervised, or unlabeled, claims. We evaluate different methods of extracting features, as well as different sets of features, and their ability to predict the correct claim label. We found that OSN websites report high rates of false claims in comparison with most other website categories, and that the rate of reported false claims is higher than the rate of true claims on fact-checking websites in most categories. At the content-analysis level, false claims tend to carry more negative sentiment and hence can provide supporting features for predicting claim classification.

Bibliographic Details
Main Authors: Izzat Alsmadi (Department of Computing and Cyber Security, Texas A&M University–San Antonio, San Antonio, Texas 78224, USA); Michael J. O'Brien (Office of the Provost, Texas A&M University–San Antonio, San Antonio, Texas 78224, USA)
Format: Article
Language: English
Published: AIMS Press, 2020-01-01
Series: Mathematical Biosciences and Engineering, Vol. 17, No. 3, pp. 1922–1939
ISSN: 1551-0018
DOI: 10.3934/mbe.2020101
Subjects: feature extraction; information credibility; online social networks; predictive models
Online Access: https://www.aimspress.com/article/doi/10.3934/mbe.2020101?viewType=HTML
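
The abstract describes a supervised pipeline: claims scraped from Snopes and Emergent with human-assigned ratings, features extracted from the claim text, and a model trained to predict the true/false label, with negative sentiment noted as a useful signal for false claims. The sketch below is a minimal, hypothetical illustration of that kind of pipeline, not the authors' implementation; the input file claims.csv, the column names claim_text and rating, and the choice of TF-IDF features with logistic regression are assumptions standing in for the feature sets and classifiers the paper actually compares.

# Minimal illustrative sketch (not the authors' code): predict a claim's
# rating from its text, as outlined in the abstract. The file and column
# names are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical dataset: one row per fact-checked claim (e.g., extracted from
# Snopes or Emergent), with columns "claim_text" and "rating".
df = pd.read_csv("claims.csv")

# Keep only claims with definite ratings, as the paper does.
definite = ["false", "mostly false", "true", "mostly true"]
df = df[df["rating"].str.lower().isin(definite)]

# Collapse ratings to a binary label: 1 = true-leaning, 0 = false-leaning.
y = df["rating"].str.lower().isin(["true", "mostly true"]).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df["claim_text"], y, test_size=0.2, random_state=42, stratify=y)

# One illustrative feature set and classifier: word/bigram TF-IDF features
# fed to a logistic regression model.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))

A sentiment polarity score for each claim (for example, from NLTK's VADER analyzer) could be appended as an additional feature to test the abstract's observation that false claims skew toward more negative sentiment.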