Robust and Efficient Deep Learning for Misinformation Prevention

Deep learning models have recently revolutionized the online environment, opening up many exciting opportunities to improve the user experience. These models, however, also introduce new threats by possibly creating or promoting misinformation, either unintentionally or deliberately by malicious users...


Bibliographic Details
Main Author: Schuster, Tal
Other Authors: Barzilay, Regina
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/140022
_version_ 1826210016180305920
author Schuster, Tal
author2 Barzilay, Regina
author_facet Barzilay, Regina
Schuster, Tal
author_sort Schuster, Tal
collection MIT
description Deep learning models have recently revolutionized the online environment, opening up many exciting opportunities to improve the user experience. These models, however, also introduce new threats by possibly creating or promoting misinformation, either unintentionally or deliberately by malicious users. In this thesis, we present novel methods to fight the proliferation of misinformation online. We focus on the task of automated fact verification where the veracity of a given claim is examined against external reliable sources. We analyze the desired specifications of fact verification systems and describe the need for efficiency when operating against large comprehensive free text information resources, while ensuring robustness to challenging inputs and sensitivity to modifications in the referenced evidence. Our methods are general and, as we demonstrate, improve the robustness, efficiency, and interpretability of many other models beyond fact verification. In the first part of this thesis, we focus on the robustness, sensitivity, and interpretability of sentence-pair classifiers. We present methodologies for identifying and quantifying idiosyncrasies in large curated datasets that undesirably lead models to rely on nongeneralizable statistical cues. We demonstrate how contrastive evidence pairs can alleviate this issue by enforcing models to perform sentence-pair inference. To obtain such examples automatically, we develop a novel rationale-based denoising pipeline for modifying refuting evidence to agree with a given claim. In addition, we present a semi-automated solution for creating contrastive pairs from Wikipedia revisions and share a new large dataset. In the second part, we turn to improve the inference efficiency of both the evidence retrieval and the claim classification modules, while reliably controlling their accuracy. We introduce new confidence measures and develop novel extensions to the conformal prediction framework. Our methods can dynamically allocate the required computational resources for each input to satisfy an arbitrary user-specified tolerance level. We demonstrate on multiple datasets that our well-calibrated decision rules reliably provide significant efficiency gains.
first_indexed 2024-09-23T14:39:51Z
format Thesis
id mit-1721.1/140022
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T14:39:51Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/140022 2022-10-06T04:03:25Z Robust and Efficient Deep Learning for Misinformation Prevention Schuster, Tal Barzilay, Regina Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Deep learning models have recently revolutionized the online environment, opening up many exciting opportunities to improve the user experience. These models, however, also introduce new threats by possibly creating or promoting misinformation, either unintentionally or deliberately by malicious users. In this thesis, we present novel methods to fight the proliferation of misinformation online. We focus on the task of automated fact verification where the veracity of a given claim is examined against external reliable sources. We analyze the desired specifications of fact verification systems and describe the need for efficiency when operating against large comprehensive free text information resources, while ensuring robustness to challenging inputs and sensitivity to modifications in the referenced evidence. Our methods are general and, as we demonstrate, improve the robustness, efficiency, and interpretability of many other models beyond fact verification. In the first part of this thesis, we focus on the robustness, sensitivity, and interpretability of sentence-pair classifiers. We present methodologies for identifying and quantifying idiosyncrasies in large curated datasets that undesirably lead models to rely on nongeneralizable statistical cues. We demonstrate how contrastive evidence pairs can alleviate this issue by enforcing models to perform sentence-pair inference. To obtain such examples automatically, we develop a novel rationale-based denoising pipeline for modifying refuting evidence to agree with a given claim. In addition, we present a semi-automated solution for creating contrastive pairs from Wikipedia revisions and share a new large dataset. In the second part, we turn to improve the inference efficiency of both the evidence retrieval and the claim classification modules, while reliably controlling their accuracy. We introduce new confidence measures and develop novel extensions to the conformal prediction framework. Our methods can dynamically allocate the required computational resources for each input to satisfy an arbitrary user-specified tolerance level. We demonstrate on multiple datasets that our well-calibrated decision rules reliably provide significant efficiency gains. Ph.D. 2022-02-07T15:19:30Z 2022-02-07T15:19:30Z 2021-09 2021-09-21T19:31:08.717Z Thesis https://hdl.handle.net/1721.1/140022 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Schuster, Tal
Robust and Efficient Deep Learning for Misinformation Prevention
title Robust and Efficient Deep Learning for Misinformation Prevention
title_full Robust and Efficient Deep Learning for Misinformation Prevention
title_fullStr Robust and Efficient Deep Learning for Misinformation Prevention
title_full_unstemmed Robust and Efficient Deep Learning for Misinformation Prevention
title_short Robust and Efficient Deep Learning for Misinformation Prevention
title_sort robust and efficient deep learning for misinformation prevention
url https://hdl.handle.net/1721.1/140022
work_keys_str_mv AT schustertal robustandefficientdeeplearningformisinformationprevention