Just in time delivery: Leveraging operating systems knowledge for better datacenter congestion control

Network links and server CPUs are heavily contended resources in modern datacenters. To keep tail latencies low, datacenter operators drastically overprovision both types of resources today, and there has been significant research into effectively managing network traffic [4, 19, 21, 29] and CPU loa...

Bibliographic Details
Main Authors: Ousterhout, Amy Elizabeth; Belay, Adam M; Zhang, I
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: USENIX Association 2020
Online Access: https://hdl.handle.net/1721.1/128726
Description: Network links and server CPUs are heavily contended resources in modern datacenters. To keep tail latencies low, datacenter operators drastically overprovision both types of resources today, and there has been significant research into effectively managing network traffic [4, 19, 21, 29] and CPU load [22, 27, 32]. However, this work typically looks at the two resources in isolation. In this paper, we make the observation that, in the datacenter, the allocation of network and CPU resources should be co-designed for the most efficiency and the best response times. For example, while congestion control protocols can prioritize traffic from certain flows, this provides no benefit if the traffic arrives at an overloaded server that will only queue the request. This paper explores the potential benefits of such a co-designed resource allocator and considers the recent work in both CPU scheduling and congestion control that is best suited to such a system. We propose Chimera, a new datacenter OS that integrates a receiver-based congestion control protocol with OS insight into application queues, using the recent Shenango operating system [32].
Citation: Ousterhout, Amy et al. "Just in time delivery: Leveraging operating systems knowledge for better datacenter congestion control." 11th USENIX Workshop on Hot Topics in Cloud Computing, HotCloud 2019, co-located with USENIX ATC 2019, July 2019, Renton, Washington, USENIX Association, July 2019.
Copyright: © 2019 USENIX Association
Related Link: https://www.usenix.org/conference/hotcloud19/presentation/ousterhout
License: Creative Commons Attribution-Noncommercial-Share Alike, http://creativecommons.org/licenses/by-nc-sa/4.0/