Load balancing and memory optimizations for expert parallel training of large language models
Large language models (LLMs) are an effective way to solve many text-based machine learning tasks, but they require huge amounts of computation to train and evaluate. Mixture-of-experts models have emerged as a way to reduce the amount of computation required for LLMs without compromising accuracy. It is...
Main Author: Wisdom, Daniel
Other Authors: Leiserson, Charles E.
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/153897
Similar Items
- Effects of Parallel Processing Implementation on Balanced Load-Division Depending on Distributed Memory Systems
  by: Subhi R. Zebari, et al.
  Published: (2012-06-01)
- Deep Learning Compiler Load Balancing Optimization Method for Model Training
  by: WANG Li, GAO Kai, ZHAO Yaqian, LI Rengang, CAO Fang, GUO Zhenhua
  Published: (2024-01-01)
- Load Balancing Based on Firefly and Ant Colony Optimization Algorithms for Parallel Computing
  by: Yong Li, et al.
  Published: (2022-10-01)
- Tight bounds for parallel randomized load balancing
  by: Lenzen, Christoph, et al.
  Published: (2016)
- Scheduling and load balancing in parallel and distributed systems
  by: Shirazi, Behrooz A., et al.
  Published: (1995)