Towards optimal scheduling of deep learning training jobs in GPU clusters

Deep Learning (DL) has emerged as a groundbreaking technology, revolutionizing numerous fields. This paradigm shift has fueled an ever-growing demand for training DL models, leading to the development of hyperscale GPU clusters. Despite their massive computational power, these clusters often struggle...


Bibliographic Details
Main Author: Gao, Wei
Other Authors: Zhang, Tianwei
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2025
Subjects:
Online Access: https://hdl.handle.net/10356/182633