Towards a Deeper Understanding of Neural Language Generation
In recent years, the field of language modelling has witnessed exciting developments. In particular, thanks to large-scale data, powerful model architectures, and high-speed parallel computing devices, researchers are able to train language models that can generate realistic text. However, our understanding of these powerful language models remains shallow. Which aspects of a language model are good, and which need to be improved? These are the key questions behind this thesis.
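The first topic studied in the thesis is the sampling behavior of auto-regressive LMs under popular truncated sampling algorithms. As background only (this is not code from the thesis), here is a minimal sketch of nucleus (top-p) sampling over a toy next-token distribution; the function name and the toy probabilities are illustrative:

```python
import random

def nucleus_sample(probs, p=0.9, rng=random):
    """Top-p (nucleus) sampling: keep the smallest set of highest-probability
    tokens whose cumulative probability reaches p, then sample from that set
    after renormalizing. `probs` maps token -> probability (sums to 1)."""
    # Rank tokens by descending probability and build the nucleus.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, cum = [], 0.0
    for tok, pr in ranked:
        nucleus.append((tok, pr))
        cum += pr
        if cum >= p:
            break
    # Sample proportionally to probability within the nucleus.
    total = sum(pr for _, pr in nucleus)
    r = rng.random() * total
    for tok, pr in nucleus:
        r -= pr
        if r <= 0:
            return tok
    return nucleus[-1][0]

dist = {"the": 0.5, "a": 0.3, "cat": 0.15, "zyx": 0.05}
# With p=0.8 the nucleus is {"the", "a"}; the low-probability tail
# ("cat", "zyx") can never be drawn.
print(nucleus_sample(dist, p=0.8))
```

Truncation of the tail is what distinguishes nucleus sampling from sampling the full distribution: it trades a small amount of diversity for a large reduction in degenerate outputs.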
Main Author: | He, Tianxing |
---|---|
Other Authors: | Glass, James R. |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2022 |
Online Access: | https://hdl.handle.net/1721.1/144922 |
author | He, Tianxing |
author2 | Glass, James R. |
collection | MIT |
description | In recent years, the field of language modelling has witnessed exciting developments. In particular, thanks to large-scale data, powerful model architectures, and high-speed parallel computing devices, researchers are able to train language models that can generate realistic text. However, our understanding of these powerful language models remains shallow. Which aspects of a language model are good, and which need to be improved? These are the key questions behind this thesis.
This thesis includes a set of behavior analyses of language models (LMs), with a focus on generation. We also propose methods to alleviate some of the identified problems. The four high-level topics are: (1) the general sampling behavior of an auto-regressive LM, with a closer look at the popular sampling algorithms; (2) whether the LM is vulnerable to adversarial attacks, and how to make it more robust; (3) the LM's ability to remember knowledge learned from data and, relatedly, the best way to expose this learned knowledge; (4) how to get more fine-grained control over the model's generation. |
format | Thesis |
id | mit-1721.1/144922 |
institution | Massachusetts Institute of Technology |
publishDate | 2022 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/144922 2022-08-30T03:53:22Z Towards a Deeper Understanding of Neural Language Generation He, Tianxing Glass, James R. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2022-08-29T16:21:10Z 2022-05 Thesis https://hdl.handle.net/1721.1/144922 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology |
title | Towards a Deeper Understanding of Neural Language Generation |
url | https://hdl.handle.net/1721.1/144922 |