Improving LLM Long Context Understanding via Synthetic Data and Adaptive Compression

Recent innovations in large language models (LLMs) have led to their widespread use, but the long-context problem remains a fundamental challenge. Transformer-based LLMs are constrained by the quadratic scaling of the self-attention mechanism, which restricts most popular LLMs to a context length of...
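The quadratic-scaling claim in the abstract can be seen concretely: naive self-attention materializes an n × n score matrix, so time and memory grow with the square of the context length. The sketch below is not from the thesis; it is a minimal NumPy illustration, with identity Q/K/V projections assumed purely for brevity.

```python
# Minimal sketch (illustrative, not from the thesis): naive self-attention
# builds an (n, n) score matrix, so cost scales as O(n^2 * d) in sequence
# length n and embedding dimension d.
import numpy as np

def naive_self_attention(x: np.ndarray) -> np.ndarray:
    """x: (n, d) token embeddings; returns (n, d) attended outputs."""
    n, d = x.shape
    q, k, v = x, x, x                     # identity projections, for brevity
    scores = q @ k.T / np.sqrt(d)         # (n, n) -- the quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v                    # (n, d)

# Doubling n quadruples the score matrix: at n = 4096 it already holds
# ~16.8M entries, which is why long contexts are expensive for transformers.
```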

Bibliographic Details
Main Author: Li, Jerry
Other Authors: Feris, Rogerio
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/156754