Load balancing and memory optimizations for expert parallel training of large language models

Large language models (LLMs) are an effective way to solve many text-based machine learning tasks, but they require huge amounts of computation to train and evaluate. Mixture-of-experts models have emerged as a way to reduce the computation required for LLMs without compromising accuracy. It is...

Bibliographic Details
Main Author: Wisdom, Daniel
Other Authors: Leiserson, Charles E.
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/153897