Measuring and Manipulating State Representations in Neural Language Models

Modern neural language models (LMs) are typically pre-trained with a self-supervised objective: they are presented with texts that have pieces withheld, and asked to generate the withheld portions of the text. By simply scaling up such training, LMs have been able to achieve remarkable performance on many language reasoning benchmarks. However, sentences generated by LMs often still suffer from coherence errors: they describe events and situations inconsistent with the state of the world described by preceding text. One account of the successes and failures of LM generation holds that LMs are simply modeling surface word co-occurrence statistics. However, we provide evidence for an alternative account (not mutually exclusive with the first): LMs represent and reason about the world they describe. In BART and T5 transformer LMs, we identify contextual word representations that function as models of entities and situations as they evolve throughout a discourse. These neural representations have functional similarities to linguistic models of dynamic semantics: they support a linear readout of each entity's current properties and relations, and can be manipulated with predictable effects on language generation. Our results indicate that prediction in pretrained LMs is supported, at least in part, by dynamic representations of meaning and implicit simulation of entity state, and that this behavior can be learned with only text as training data. Consequently, when LMs fail to generate coherent text, the failure can be attributed either to errors in inferring state from context or to errors in generating next sentences consistent with the inferred state. We describe a procedure for distinguishing these two types of errors. In models with correctable errors of the first type, we show that targeted supervision can address them. We introduce two procedures for using explicit representations of world state as auxiliary supervision. These procedures efficiently improve LM coherence, in some cases providing the benefits of 1,000–9,000 training examples with only 500 state annotations.
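
The "linear readout" of entity state described above can be pictured as training a linear classifier on frozen encoder representations. The sketch below is illustrative only and is not code from the thesis: the BART checkpoint name is a standard public model, but the toy discourse examples, the state labels, and the entity_representation helper are hypothetical placeholders.

    # Illustrative sketch: a linear probe ("readout") over frozen contextual representations.
    # Assumes a Hugging Face BART encoder and scikit-learn; data and labels are toy placeholders.
    import numpy as np
    import torch
    from transformers import BartTokenizer, BartModel
    from sklearn.linear_model import LogisticRegression

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
    model = BartModel.from_pretrained("facebook/bart-base").eval()

    def entity_representation(text, entity):
        """Mean-pool the frozen encoder states of the tokens mentioning the entity."""
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            hidden = model.encoder(**enc).last_hidden_state[0]     # (seq_len, dim)
        span = tokenizer(" " + entity, add_special_tokens=False)["input_ids"]
        ids = enc["input_ids"][0].tolist()
        for i in range(len(ids) - len(span) + 1):                  # locate the entity's tokens
            if ids[i:i + len(span)] == span:
                return hidden[i:i + len(span)].mean(dim=0).numpy()
        return hidden.mean(dim=0).numpy()                          # fallback: pool the whole text

    # Hypothetical training data: a discourse prefix, an entity, and its current state.
    examples = [
        ("You pick up the key. You unlock the chest.", "chest", "open"),
        ("You pick up the key.", "chest", "closed"),
    ]
    X = np.stack([entity_representation(text, ent) for text, ent, _ in examples])
    y = [label for _, _, label in examples]

    probe = LogisticRegression(max_iter=1000).fit(X, y)            # the linear readout itself
    print(probe.predict(X))

Under this framing, probe accuracy on held-out discourses measures how much entity state is linearly decodable from the representations, and editing a representation along the probe's weight directions is one way to test whether manipulations have predictable effects on generation.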

Bibliographic Details
Main Author: Li, Belinda Zou
Other Authors: Andreas, Jacob
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: S.M.
Format: Thesis
Published: Massachusetts Institute of Technology, 2023
Online Access: https://hdl.handle.net/1721.1/150114
Rights: In Copyright - Educational Use Permitted (Copyright MIT), http://rightsstatements.org/page/InC-EDU/1.0/