Measuring and Manipulating State Representations in Neural Language Models
Modern neural language models (LMs) are typically pre-trained with a self-supervised objective: they are presented with texts that have pieces withheld, and asked to generate the withheld portions of the text. By simply scaling up such training, LMs have achieved remarkable performance on many language reasoning benchmarks. However, sentences generated by LMs often still suffer from coherence errors: they describe events and situations inconsistent with the state of the world described by the preceding text. One account of the successes and failures of LM generation holds that LMs are simply modeling surface word co-occurrence statistics. However, we provide evidence for an alternative account (not mutually exclusive with the first): LMs represent and reason about the world they describe. In BART and T5 transformer LMs, we identify contextual word representations that function as models of entities and situations as they evolve throughout a discourse. These neural representations have functional similarities to linguistic models of dynamic semantics: they support a linear readout of each entity's current properties and relations, and they can be manipulated with predictable effects on language generation. Our results indicate that prediction in pretrained LMs is supported, at least in part, by dynamic representations of meaning and implicit simulation of entity state, and that this behavior can be learned with only text as training data. Consequently, when LMs fail to generate coherent text, the failure can be attributed either to errors in inferring state from context or to errors in generating next sentences consistent with the inferred state. We describe a procedure for distinguishing these two types of errors. In models with correctable errors of the first type, we show that targeted supervision can address them: we introduce two procedures for using explicit representations of world state as auxiliary supervision. These procedures efficiently improve LM coherence, in some cases providing the benefits of 1,000–9,000 training examples with only 500 state annotations.
| Main Author: | Li, Belinda Zou |
|---|---|
| Other Authors: | Andreas, Jacob |
| Department: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
| Format: | Thesis |
| Published: | Massachusetts Institute of Technology, 2023 |
| Online Access: | https://hdl.handle.net/1721.1/150114 |
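The abstract describes a "linear readout" of each entity's current properties from contextual word representations. As a purely illustrative companion, the sketch below shows one way such a probe could be set up with off-the-shelf tools; it is not the thesis's code, and the model choice (`facebook/bart-base`), the mean-pooling scheme, the toy contexts, and the labels are all assumptions made here for concreteness.

```python
# Minimal, hypothetical sketch of a linear probe ("linear readout") over a
# pretrained encoder's contextual representations. The model name, pooling
# choice, toy contexts, and labels are illustrative assumptions, not the
# thesis's actual experimental setup.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
encoder = AutoModel.from_pretrained("facebook/bart-base").get_encoder()
encoder.eval()

def entity_representation(context: str, entity: str) -> torch.Tensor:
    """Mean-pool encoder states over the entity mention's token span."""
    batch = tokenizer(context, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state[0]        # (seq_len, d_model)
    ent_ids = tokenizer(" " + entity, add_special_tokens=False)["input_ids"]
    ids = batch["input_ids"][0].tolist()
    for i in range(len(ids) - len(ent_ids) + 1):              # locate the mention
        if ids[i:i + len(ent_ids)] == ent_ids:
            return hidden[i:i + len(ent_ids)].mean(dim=0)
    raise ValueError(f"mention {entity!r} not found in context")

# Toy probing task (labels invented here): is the named container empty
# after the events described by the discourse?
examples = [
    ("The first beaker has 3 units of red liquid. Drain 3 units from it.", "first beaker", 1.0),
    ("The first beaker has 3 units of red liquid. Drain 1 unit from it.",  "first beaker", 0.0),
]
X = torch.stack([entity_representation(ctx, ent) for ctx, ent, _ in examples])
y = torch.tensor([label for _, _, label in examples])

probe = torch.nn.Linear(X.shape[1], 1)                        # the linear readout
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(200):                                          # encoder frozen; train the probe only
    optimizer.zero_grad()
    loss = torch.nn.functional.binary_cross_entropy_with_logits(
        probe(X).squeeze(-1), y)
    loss.backward()
    optimizer.step()
print("probe P(empty):", torch.sigmoid(probe(X)).squeeze(-1).tolist())
```

A probe of this kind only measures whether state information is linearly decodable from the frozen representations; the thesis additionally manipulates such representations and uses explicit state annotations as auxiliary supervision.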