GraphPipe: Improving the Performance and Scalability of DNN Training with Graph Pipeline Parallelism
Deep neural networks (DNNs) continue to grow rapidly in size, thus it is infeasible to train them on a single device. To address this challenge, current DNN training systems apply pipeline-parallel techniques. They split a DNN into multiple stages, construct a pipeline of them, and assign to each st...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Published: |
Massachusetts Institute of Technology
2024
|
Online Access: | https://hdl.handle.net/1721.1/156292 |