Zeus: interpretable ML-based job scheduling in GPU datacentres

Hardware accelerators such as GPUs are essential for developing Deep Learning (DL) models, as their training process is compute-intensive. A growing number of organisations have employed expensive multi-tenant GPU clusters to run distributed DL training jobs. Efficient job schedulers are re...

Bibliographic Details
Main Author: Amrita, Ravishankar
Other Authors: Zhang Tianwei
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2022
Subjects:
Online Access: https://hdl.handle.net/10356/156566