Monkey: Platform-Agnostic Hybrid-Cloud Cluster Compute Orchestration Designed for AI/ML
As AI/ML research progresses, the amount of compute needed to train and evaluate state-of-the-art AI algorithms consistently increases. With increasing needs for compute, researchers spend time designing distributed systems to scalably train and hyper-parameter optimize their latest model rather tha...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2022 |
Online Access: | https://hdl.handle.net/1721.1/139258 |