Interpreting and Editing Memory in Large Transformer Language Models

This thesis investigates the mechanisms of factual recall in large language models. We first apply causal interventions to identify neuron activations that are decisive in a model’s factual predictions; surprisingly, we find that factual recall corresponds to a sparse, localizable computation in the...
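The causal interventions mentioned in the abstract can be illustrated with a toy sketch of activation patching (causal tracing): run the model on a corrupted input, restore the clean hidden state at one layer at a time, and see how much of the clean output each restoration recovers. Everything below is a hypothetical toy, not the thesis's actual code or models.

```python
# Toy sketch of causal tracing / activation patching.
# The "model" is a chain of simple arithmetic layers, each a function of
# the running hidden state h and the input x (hypothetical, for illustration).

def run(layers, x, patch=None):
    """Run input x through the layers. If patch=(i, value) is given,
    overwrite the hidden state after layer i with the stored clean value."""
    h = 0
    for i, layer in enumerate(layers):
        h = layer(h, x)
        if patch is not None and patch[0] == i:
            h = patch[1]  # restore the clean activation at this layer
    return h

layers = [
    lambda h, x: h + x,   # layer 0: reads the input
    lambda h, x: h * 2,   # layer 1: transforms the hidden state only
    lambda h, x: h + x,   # layer 2: reads the input again
]

clean_x, corrupted_x = 5, 0

# Record the clean hidden state after each layer.
clean_states = []
h = 0
for layer in layers:
    h = layer(h, clean_x)
    clean_states.append(h)
clean_out = clean_states[-1]
corrupted_out = run(layers, corrupted_x)

# Restore one clean hidden state at a time in the corrupted run and
# measure how much of the clean output each restoration recovers.
for i, state in enumerate(clean_states):
    restored = run(layers, corrupted_x, patch=(i, state))
    print(f"restoring layer {i}: output {restored} "
          f"(clean {clean_out}, corrupted {corrupted_out})")
```

In this toy, restoring the state after layer 2 recovers the clean output exactly, while earlier restorations recover it only partially, because the corruption re-enters downstream; in the thesis's setting the analogous measurement is over transformer hidden states and token predictions.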


Bibliographic Details
Main Author: Meng, Kevin
Other Authors: Andreas, Jacob D.
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/156794

Similar Items