GraphPipe: Improving the Performance and Scalability of DNN Training with Graph Pipeline Parallelism
Deep neural networks (DNNs) continue to grow rapidly in size, making it infeasible to train them on a single device. To address this challenge, current DNN training systems apply pipeline-parallel techniques: they split a DNN into multiple stages, construct a pipeline of them, and assign to each st...
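The pipeline-parallel scheme the abstract describes can be illustrated with a minimal sketch: consecutive layers are partitioned into stages, and micro-batches flow through the stages on a GPipe-style schedule. This is a toy simulation under stated assumptions (layers are plain Python functions, stages run sequentially rather than on separate devices); all function names here are illustrative, not from the thesis.

```python
# Toy 4-layer "network": each layer is a simple function on a scalar.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x + 3, lambda x: x * 2]

def split_into_stages(layers, num_stages):
    """Partition consecutive layers into pipeline stages (one per device)."""
    per_stage = (len(layers) + num_stages - 1) // num_stages
    return [layers[i:i + per_stage] for i in range(0, len(layers), per_stage)]

def run_stage(stage, x):
    """Apply one stage's layers in order."""
    for layer in stage:
        x = layer(x)
    return x

def pipeline_run(stages, microbatches):
    """GPipe-style forward schedule: at step t, stage s processes
    micro-batch t - s, so different stages work concurrently on
    different micro-batches (simulated here with a sequential loop)."""
    m, n = len(stages), len(microbatches)
    vals = list(microbatches)  # current activation of each micro-batch
    for t in range(m + n - 1):
        for s in range(m):
            mb = t - s
            if 0 <= mb < n:
                vals[mb] = run_stage(stages[s], vals[mb])
    return vals

stages = split_into_stages(layers, num_stages=2)
print(pipeline_run(stages, [0, 1, 2]))  # → [10, 14, 18]
```

With two stages and three micro-batches the schedule takes 4 steps instead of the 6 a fully serial execution would need, which is the latency overlap pipeline parallelism exploits.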
Main Author: Kim, Sunghyun
Other Authors: Alizadeh, Mohammad
Format: Thesis
Published: Massachusetts Institute of Technology, 2024
Online Access: https://hdl.handle.net/1721.1/156292
Similar Items

- A bidirectional DNN partition mechanism for efficient pipeline parallel training in cloud
  by: Lingyun Cui, et al.
  Published: (2023-02-01)
- Scalable parallel and distributed simulation of an epidemic on a graph
  by: Guohao Dou
  Published: (2023-01-01)
- TAPP: DNN Training for Task Allocation through Pipeline Parallelism Based on Distributed Deep Reinforcement Learning
  by: Yingchi Mao, et al.
  Published: (2021-05-01)
- Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
  by: Dhulipala, Laxman, et al.
  Published: (2021)