Transfer learning and robustness for natural language processing

Thesis: Ph.D., Massachusetts Institute of Technology, Department of Mechanical Engineering, 2020

Bibliographic Details
Main Author: Jin, Di (Ph.D., Massachusetts Institute of Technology)
Other Authors: Peter Szolovits
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology, 2021
Subjects: Mechanical Engineering
Online Access: https://hdl.handle.net/1721.1/129004
Description:
Teaching machines to understand human language is one of the most elusive and long-standing challenges in Natural Language Processing (NLP). Driven by the rapid development of deep learning, state-of-the-art NLP models have achieved human-level performance on various large benchmark datasets, such as SQuAD, SNLI, and RACE. However, when these strong models are deployed in real-world applications, they often generalize poorly in two situations: (1) only a limited amount of data is available for model training; and (2) deployed models may degrade significantly on noisy test data or on natural/artificial adversaries. In short, performance degradation on low-resource tasks/datasets and on unseen data with distribution shifts poses a great challenge to the reliability of NLP models and prevents them from being widely applied in the wild. This dissertation aims to address these two issues.

For the first issue, we resort to transfer learning, leveraging knowledge acquired from related data to improve performance on a target low-resource task/dataset. Specifically, we propose transfer learning methods for three natural language understanding tasks (multiple-choice question answering, dialogue state tracking, and sequence labeling) and one natural language generation task (machine translation). These methods build on four basic transfer learning modalities: multi-task learning, sequential transfer learning, domain adaptation, and cross-lingual transfer. Our experimental results validate that transferring knowledge from related domains, tasks, and languages can significantly improve performance on the target task/dataset.

For the second issue, we propose methods to evaluate the robustness of NLP models on text classification and entailment tasks. On the one hand, we reveal that although these models can achieve accuracies above 90%, they are still easily broken by paraphrases of the original samples in which only around 10% of the words are changed to synonyms. On the other hand, by creating a new challenge set using four adversarial strategies, we find that even the best models for the aspect-based sentiment analysis task cannot reliably identify the target aspect and recognize its sentiment; instead, they are easily confused by distractor aspects. Overall, these findings raise serious concerns about the robustness of NLP models, which should be strengthened to ensure stable long-term service.

Notes: Cataloged from the student-submitted PDF of the thesis. Includes bibliographical references (pages 189-217).
Physical Description: 217 pages; application/pdf
Rights: MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy: http://dspace.mit.edu/handle/1721.1/7582
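Of the four transfer learning modalities the abstract names, the simplest to illustrate is multi-task learning. Below is a minimal sketch, assuming a shared encoder with one classification head per task so that a low-resource task can borrow representations learned on related data; the architecture, names, and sizes are illustrative assumptions, not the models actually built in the thesis.

```python
# Minimal multi-task learning sketch: a shared encoder with per-task heads.
# All hyperparameters and names here are hypothetical.
from typing import Dict

import torch
import torch.nn as nn


class MultiTaskModel(nn.Module):
    def __init__(self, vocab_size: int, hidden: int, task_classes: Dict[str, int]):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        # One encoder shared by every task, so gradients from all tasks shape it.
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        # One small classification head per task.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, n) for task, n in task_classes.items()}
        )

    def forward(self, token_ids: torch.Tensor, task: str) -> torch.Tensor:
        x = self.embed(token_ids)            # (batch, seq, hidden)
        _, state = self.encoder(x)           # final hidden state: (1, batch, hidden)
        return self.heads[task](state[-1])   # task-specific logits: (batch, classes)
```

Training would typically alternate mini-batches across tasks (e.g., computing a cross-entropy loss against each task's labels in turn), so the shared encoder sees supervision from all of them.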
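On the robustness side, the abstract's finding that models break under paraphrases changing only around 10% of words to synonyms suggests a greedy synonym-substitution attack. The sketch below is a simplified illustration of that idea, not the thesis's actual attack algorithm; `predict` and `synonyms` are hypothetical stand-ins for a victim model's true-label probability and a synonym source (e.g., embedding-space neighbors).

```python
# Simplified greedy synonym-substitution attack sketch (illustrative only).
from typing import Callable, List


def synonym_attack(
    words: List[str],
    predict: Callable[[str], float],       # probability assigned to the true label
    synonyms: Callable[[str], List[str]],  # candidate replacements for a word
    budget: float = 0.10,                  # change at most ~10% of the words
) -> List[str]:
    """Greedily swap words for synonyms to drive the true-label probability down."""
    adversary = list(words)
    max_swaps = max(1, int(budget * len(words)))
    swaps = 0
    for i, word in enumerate(words):
        if swaps >= max_swaps or predict(" ".join(adversary)) < 0.5:
            break  # budget exhausted or prediction already flipped
        best_prob = predict(" ".join(adversary))
        best_word = None
        for candidate in synonyms(word):
            adversary[i] = candidate
            prob = predict(" ".join(adversary))
            if prob < best_prob:            # keep the most damaging substitution
                best_prob, best_word = prob, candidate
        adversary[i] = best_word if best_word is not None else word
        if best_word is not None:
            swaps += 1
    return adversary
```

Under this scheme, a semantic-similarity check between the original and perturbed sentences would normally also be enforced so that substitutions remain genuine paraphrases; that constraint is omitted here for brevity.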