Transfer learning and robustness for natural language processing

Thesis: Ph.D., Massachusetts Institute of Technology, Department of Mechanical Engineering, 2020

Bibliographic Details
Main Author: Jin, Di (Ph.D., Massachusetts Institute of Technology)
Other Authors: Peter Szolovits
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology, 2021
Subjects: Mechanical Engineering
Online Access: https://hdl.handle.net/1721.1/129004
Description:
Teaching machines to understand human language is one of the most elusive and long-standing challenges in Natural Language Processing (NLP). Driven by the rapid development of deep learning, state-of-the-art NLP models have achieved human-level performance on various large benchmark datasets, such as SQuAD, SNLI, and RACE. However, when these strong models are deployed in real-world applications, they often generalize poorly in two situations: (1) only a limited amount of data is available for model training; and (2) deployed models may degrade significantly on noisy test data or on natural/artificial adversaries. In short, performance degradation on low-resource tasks/datasets and on unseen data with distribution shifts poses a great challenge to the reliability of NLP models and prevents them from being widely applied in the wild. This dissertation aims to address these two issues.

For the first issue, we resort to transfer learning, leveraging knowledge acquired from related data to improve performance on a target low-resource task/dataset. Specifically, we propose transfer learning methods for three natural language understanding tasks (multiple-choice question answering, dialogue state tracking, and sequence labeling) and one natural language generation task (machine translation). These methods build on four basic transfer learning modalities: multi-task learning, sequential transfer learning, domain adaptation, and cross-lingual transfer. Our experimental results validate that transferring knowledge from related domains, tasks, and languages can significantly improve performance on the target task/dataset.

For the second issue, we propose methods to evaluate the robustness of NLP models on text classification and entailment tasks. On the one hand, we reveal that although these models can achieve accuracies above 90%, they are still easily broken by paraphrases of the original samples in which only around 10% of the words are changed to synonyms. On the other hand, by creating a new challenge set using four adversarial strategies, we find that even the best models for the aspect-based sentiment analysis task cannot reliably identify the target aspect and recognize its sentiment; instead, they are easily confused by distractor aspects. Overall, these findings raise serious concerns about the robustness of NLP models, which should be strengthened to ensure stable long-term service.

Notes: Cataloged from the student-submitted PDF of the thesis. Includes bibliographical references (pages 189-217).
Physical Description: 217 pages; application/pdf
Rights: MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy: http://dspace.mit.edu/handle/1721.1/7582
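Of the four transfer learning modalities the abstract names, the simplest to illustrate is multi-task learning. Below is a minimal sketch, assuming a shared encoder with one classification head per task so that a low-resource task can borrow representations learned on related data; the architecture, names, and sizes are illustrative assumptions, not the models actually built in the thesis.

```python
# Minimal multi-task learning sketch: a shared encoder with per-task heads.
# All hyperparameters and names here are hypothetical.
from typing import Dict

import torch
import torch.nn as nn


class MultiTaskModel(nn.Module):
    def __init__(self, vocab_size: int, hidden: int, task_classes: Dict[str, int]):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        # One encoder shared by every task, so gradients from all tasks shape it.
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        # One small classification head per task.
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, n) for task, n in task_classes.items()}
        )

    def forward(self, token_ids: torch.Tensor, task: str) -> torch.Tensor:
        x = self.embed(token_ids)            # (batch, seq, hidden)
        _, state = self.encoder(x)           # final hidden state: (1, batch, hidden)
        return self.heads[task](state[-1])   # task-specific logits: (batch, classes)
```

Training would typically alternate mini-batches across tasks (e.g., computing a cross-entropy loss against each task's labels in turn), so the shared encoder sees supervision from all of them.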
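On the robustness side, the abstract's finding that models break under paraphrases changing only around 10% of words to synonyms suggests a greedy synonym-substitution attack. The sketch below is a simplified illustration of that idea, not the thesis's actual attack algorithm; `predict` and `synonyms` are hypothetical stand-ins for a victim model's true-label probability and a synonym source (e.g., embedding-space neighbors).

```python
# Simplified greedy synonym-substitution attack sketch (illustrative only).
from typing import Callable, List


def synonym_attack(
    words: List[str],
    predict: Callable[[str], float],       # probability assigned to the true label
    synonyms: Callable[[str], List[str]],  # candidate replacements for a word
    budget: float = 0.10,                  # change at most ~10% of the words
) -> List[str]:
    """Greedily swap words for synonyms to drive the true-label probability down."""
    adversary = list(words)
    max_swaps = max(1, int(budget * len(words)))
    swaps = 0
    for i, word in enumerate(words):
        if swaps >= max_swaps or predict(" ".join(adversary)) < 0.5:
            break  # budget exhausted or prediction already flipped
        best_prob = predict(" ".join(adversary))
        best_word = None
        for candidate in synonyms(word):
            adversary[i] = candidate
            prob = predict(" ".join(adversary))
            if prob < best_prob:            # keep the most damaging substitution
                best_prob, best_word = prob, candidate
        adversary[i] = best_word if best_word is not None else word
        if best_word is not None:
            swaps += 1
    return adversary
```

Under this scheme, a semantic-similarity check between the original and perturbed sentences would normally also be enforced so that substitutions remain genuine paraphrases; that constraint is omitted here for brevity.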