Monkey: Platform-Agnostic Hybrid-Cloud Cluster Compute Orchestration Designed for AI/ML

As AI/ML research progresses, the amount of compute needed to train and evaluate state-of-the-art AI algorithms consistently increases. With increasing needs for compute, researchers spend time designing distributed systems to scalably train and hyper-parameter optimize their latest model rather tha...

Bibliographic Details
Main Author: Lamp, Avery
Other Authors: Agrawal, Pulkit
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/139258