Load balancing and memory optimizations for expert parallel training of large language models
Large language models (LLMs) are an effective way to solve many text-based machine learning tasks, but they require huge amounts of computation to train and evaluate. Mixture-of-experts models have emerged as a way to reduce the amount of computation required for LLMs without compromising accuracy. It is...
Main Author: Wisdom, Daniel
Other Authors: Leiserson, Charles E.
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/153897
Similar Items
- Effects of Parallel Processing Implementation on Balanced Load-Division Depending on Distributed Memory Systems
  by: Subhi R. Zebari, et al.
  Published: (2012-06-01)
- Deep Learning Compiler Load Balancing Optimization Method for Model Training
  by: WANG Li, GAO Kai, ZHAO Yaqian, LI Rengang, CAO Fang, GUO Zhenhua
  Published: (2024-01-01)
- Load Balancing Based on Firefly and Ant Colony Optimization Algorithms for Parallel Computing
  by: Yong Li, et al.
  Published: (2022-10-01)
- Tight bounds for parallel randomized load balancing
  by: Lenzen, Christoph, et al.
  Published: (2016)
- Scheduling and load balancing in parallel and distributed systems
  by: Shirazi, Behrooz A., et al.
  Published: (1995)