Measuring and Manipulating State Representations in Neural Language Models

Modern neural language models (LMs) are typically pre-trained with a self-supervised objective: they are presented with texts that have pieces withheld, and asked to generate the withheld portions of the text. By simply scaling up such training, LMs have been able to achieve remarkable performance on many language reasoning benchmarks. However, sentences generated by LMs often still suffer from coherence errors: they describe events and situations inconsistent with the state of the world described by preceding text. One account of the successes and failures of LM generation holds that LMs are simply modeling surface word co-occurrence statistics. However, we provide evidence for an alternative account (not mutually exclusive with the first): LMs represent and reason about the world they describe. In BART and T5 transformer LMs, we identify contextual word representations that function as models of entities and situations as they evolve throughout a discourse. These neural representations have functional similarities to linguistic models of dynamic semantics: they support a linear readout of each entity's current properties and relations, and can be manipulated with predictable effects on language generation. Our results indicate that prediction in pretrained LMs is supported, at least in part, by dynamic representations of meaning and implicit simulation of entity state, and that this behavior can be learned with only text as training data. Consequently, when LMs fail to generate coherent text, the failure can be attributed either to errors in inferring state from context or to errors in generating next sentences consistent with the inferred state. We describe a procedure for distinguishing these two types of errors. In models with correctable errors of the first type, we show that targeted supervision can address them. We introduce two procedures for using explicit representations of world state as auxiliary supervision. These procedures efficiently improve LM coherence, in some cases providing the benefits of 1,000–9,000 training examples with only 500 state annotations.
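
The "linear readout" of entity state described above can be pictured as training a linear classifier on frozen encoder representations. The sketch below is illustrative only and is not code from the thesis: the BART checkpoint name is a standard public model, but the toy discourse examples, the state labels, and the entity_representation helper are hypothetical placeholders.

    # Illustrative sketch: a linear probe ("readout") over frozen contextual representations.
    # Assumes a Hugging Face BART encoder and scikit-learn; data and labels are toy placeholders.
    import numpy as np
    import torch
    from transformers import BartTokenizer, BartModel
    from sklearn.linear_model import LogisticRegression

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
    model = BartModel.from_pretrained("facebook/bart-base").eval()

    def entity_representation(text, entity):
        """Mean-pool the frozen encoder states of the tokens mentioning the entity."""
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model.encoder(**enc).last_hidden_state[0]     # (seq_len, dim)
        span = tokenizer(" " + entity, add_special_tokens=False)["input_ids"]
        ids = enc["input_ids"][0].tolist()
        for i in range(len(ids) - len(span) + 1):                  # locate the entity's tokens
            if ids[i:i + len(span)] == span:
                return hidden[i:i + len(span)].mean(dim=0).numpy()
        return hidden.mean(dim=0).numpy()                          # fallback: pool the whole text

    # Hypothetical training data: a discourse prefix, an entity, and its current state.
    examples = [
        ("You pick up the key. You unlock the chest.", "chest", "open"),
        ("You pick up the key.", "chest", "closed"),
    ]
    X = np.stack([entity_representation(text, ent) for text, ent, _ in examples])
    y = [label for _, _, label in examples]

    probe = LogisticRegression(max_iter=1000).fit(X, y)            # the linear readout itself
    print(probe.predict(X))

Under this framing, probe accuracy on held-out discourses measures how much entity state is linearly decodable from the representations, and editing a representation along the probe's weight directions is one way to test whether manipulations have predictable effects on generation.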

Bibliographic Details
Main Author: Li, Belinda Zou
Other Authors: Andreas, Jacob
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: S.M.
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Online Access: https://hdl.handle.net/1721.1/150114
Rights: In Copyright - Educational Use Permitted (Copyright MIT), http://rightsstatements.org/page/InC-EDU/1.0/