Robust and Efficient Deep Learning for Misinformation Prevention

Deep learning models have recently revolutionized the online environment, opening up many exciting opportunities to improve the user experience. These models, however, also introduce new threats by possibly creating or promoting misinformation, either unintentionally or deliberately by malicious users...

Bibliographic Details
Main Author:	Schuster, Tal
Other Authors:	Barzilay, Regina
Format:	Thesis
Published:	Massachusetts Institute of Technology 2022
Online Access:	https://hdl.handle.net/1721.1/140022

_version_	1826210016180305920
author	Schuster, Tal
author2	Barzilay, Regina
author_facet	Barzilay, Regina Schuster, Tal
author_sort	Schuster, Tal
collection	MIT
description	Deep learning models have recently revolutionized the online environment, opening up many exciting opportunities to improve the user experience. These models, however, also introduce new threats by possibly creating or promoting misinformation, either unintentionally or deliberately by malicious users. In this thesis, we present novel methods to fight the proliferation of misinformation online. We focus on the task of automated fact verification, where the veracity of a given claim is examined against external reliable sources. We analyze the desired specifications of fact verification systems and describe the need for efficiency when operating against large, comprehensive free-text information resources, while ensuring robustness to challenging inputs and sensitivity to modifications in the referenced evidence. Our methods are general and, as we demonstrate, improve the robustness, efficiency, and interpretability of many other models beyond fact verification. In the first part of this thesis, we focus on the robustness, sensitivity, and interpretability of sentence-pair classifiers. We present methodologies for identifying and quantifying idiosyncrasies in large curated datasets that undesirably lead models to rely on nongeneralizable statistical cues. We demonstrate how contrastive evidence pairs can alleviate this issue by forcing models to perform sentence-pair inference. To obtain such examples automatically, we develop a novel rationale-based denoising pipeline for modifying refuting evidence to agree with a given claim. In addition, we present a semi-automated solution for creating contrastive pairs from Wikipedia revisions and share a new large dataset. In the second part, we turn to improving the inference efficiency of both the evidence retrieval and the claim classification modules, while reliably controlling their accuracy. We introduce new confidence measures and develop novel extensions to the conformal prediction framework. Our methods can dynamically allocate the required computational resources for each input to satisfy an arbitrary user-specified tolerance level. We demonstrate on multiple datasets that our well-calibrated decision rules reliably provide significant efficiency gains.
first_indexed	2024-09-23T14:39:51Z
format	Thesis
id	mit-1721.1/140022
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T14:39:51Z
publishDate	2022
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/140022 2022-10-06T04:03:25Z Robust and Efficient Deep Learning for Misinformation Prevention Schuster, Tal Barzilay, Regina Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2022-02-07T15:19:30Z 2022-02-07T15:19:30Z 2021-09 2021-09-21T19:31:08.717Z Thesis https://hdl.handle.net/1721.1/140022 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle	Schuster, Tal Robust and Efficient Deep Learning for Misinformation Prevention
title	Robust and Efficient Deep Learning for Misinformation Prevention
title_full	Robust and Efficient Deep Learning for Misinformation Prevention
title_fullStr	Robust and Efficient Deep Learning for Misinformation Prevention
title_full_unstemmed	Robust and Efficient Deep Learning for Misinformation Prevention
title_short	Robust and Efficient Deep Learning for Misinformation Prevention
title_sort	robust and efficient deep learning for misinformation prevention
url	https://hdl.handle.net/1721.1/140022
work_keys_str_mv	AT schustertal robustandefficientdeeplearningformisinformationprevention

Similar Items