Improving LLM Long Context Understanding via Synthetic Data and Adaptive Compression
Recent innovations in large language models (LLMs) have led to their widespread use, but the long context problem remains a fundamental challenge. Transformer-based LLMs are constrained by the quadratic scaling of the self-attention mechanism, which restricts most popular LLMs to a context length of...
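The quadratic scaling mentioned in the abstract can be illustrated with a short sketch (illustrative only, not code from the thesis): for a sequence of length n, the self-attention score matrix Q·Kᵀ has shape n × n, so its memory and compute cost grow with n².

```python
import numpy as np

# Illustrative sketch (not from the thesis): the self-attention score matrix
# Q @ K.T has shape (n, n) for a sequence of length n, so its size grows
# quadratically with context length.
def score_matrix_entries(seq_len: int, d_model: int = 64) -> int:
    q = np.random.randn(seq_len, d_model)
    k = np.random.randn(seq_len, d_model)
    scores = q @ k.T                      # shape (seq_len, seq_len)
    return scores.size

for n in (256, 512, 1024, 2048):
    # doubling the context length quadruples the score-matrix size
    print(f"n={n:5d}  score entries={score_matrix_entries(n):>10,}")
```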
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2024 |
Online Access: | https://hdl.handle.net/1721.1/156754 |