Optimizing Single DGX-A100 System: Overcoming GPU Limitations via Efficient Parallelism and Scheduling for Large Language Models

In this study, we introduce a novel training algorithm specifically designed to overcome the limitations of GPU memory on a single DGX-A100 system. By utilizing the CPU and main memory in the training process and applying a strategy of division and parallelization, our algorithm enhances the size of...

Full description

Bibliographic Details
Main Authors: Kyeong-Hwan Kim, Chang-Sung Jeong
Format: Article
Language:English
Published: MDPI AG 2023-08-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/13/16/9306

Similar Items