Robust and Efficient Deep Learning for Misinformation Prevention
Deep learning models have recently revolutionized the online environment, opening up many exciting opportunities to improve the user experience. These models, however, also introduce new threats by possibly creating or promoting misinformation, either unintentionally or deliberately by malicious users...
Main Author: | Schuster, Tal |
---|---|
Other Authors: | Barzilay, Regina |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2022 |
Online Access: | https://hdl.handle.net/1721.1/140022 |
_version_ | 1826210016180305920 |
---|---|
author | Schuster, Tal |
author2 | Barzilay, Regina |
author_facet | Barzilay, Regina Schuster, Tal |
author_sort | Schuster, Tal |
collection | MIT |
description | Deep learning models have recently revolutionized the online environment, opening up many exciting opportunities to improve the user experience. These models, however, also introduce new threats by possibly creating or promoting misinformation, either unintentionally or deliberately by malicious users. In this thesis, we present novel methods to fight the proliferation of misinformation online. We focus on the task of automated fact verification, in which the veracity of a given claim is examined against reliable external sources. We analyze the desired specifications of fact verification systems and describe the need for efficiency when operating against large, comprehensive free-text information resources, while ensuring robustness to challenging inputs and sensitivity to modifications in the referenced evidence. Our methods are general and, as we demonstrate, improve the robustness, efficiency, and interpretability of many other models beyond fact verification.
In the first part of this thesis, we focus on the robustness, sensitivity, and interpretability of sentence-pair classifiers. We present methodologies for identifying and quantifying idiosyncrasies in large curated datasets that undesirably lead models to rely on nongeneralizable statistical cues. We demonstrate how contrastive evidence pairs can alleviate this issue by forcing models to perform sentence-pair inference. To obtain such examples automatically, we develop a novel rationale-based denoising pipeline for modifying refuting evidence to agree with a given claim. In addition, we present a semi-automated solution for creating contrastive pairs from Wikipedia revisions and share a new large dataset.
In the second part, we turn to improving the inference efficiency of both the evidence retrieval and the claim classification modules, while reliably controlling their accuracy. We introduce new confidence measures and develop novel extensions to the conformal prediction framework. Our methods can dynamically allocate the required computational resources for each input to satisfy an arbitrary user-specified tolerance level. We demonstrate on multiple datasets that our well-calibrated decision rules reliably provide significant efficiency gains. |
first_indexed | 2024-09-23T14:39:51Z |
format | Thesis |
id | mit-1721.1/140022 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T14:39:51Z |
publishDate | 2022 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/140022 2022-10-06T04:03:25Z Robust and Efficient Deep Learning for Misinformation Prevention Schuster, Tal Barzilay, Regina Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2022-02-07T15:19:30Z 2022-02-07T15:19:30Z 2021-09 2021-09-21T19:31:08.717Z Thesis https://hdl.handle.net/1721.1/140022 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology |
spellingShingle | Schuster, Tal Robust and Efficient Deep Learning for Misinformation Prevention |
title | Robust and Efficient Deep Learning for Misinformation Prevention |
title_sort | robust and efficient deep learning for misinformation prevention |
url | https://hdl.handle.net/1721.1/140022 |
work_keys_str_mv | AT schustertal robustandefficientdeeplearningformisinformationprevention |