Efficient Deployment Algorithms for Large Language Models

Large language models (LLMs) have achieved impressive performance on various natural language tasks. However, their massive computational and memory requirements hinder widespread deployment. Additionally, deploying them on extensive inputs presents efficiency and accuracy challenges. This proposal...

Bibliographic Details
Main Author: Xiao, Guangxuan
Other Authors: Han, Song
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/156332
https://orcid.org/0000-0002-7182-9284