Improving network connection locality on multicore systems

Incoming and outgoing processing for a given TCP connection often execute on different cores: an incoming packet is typically processed on the core that receives the interrupt, while outgoing data processing occurs on the core running the relevant user code. As a result, accesses to read/write conne...


Bibliographic Details
Main Authors: Pesterev, Aleksey, Strauss, Jacob, Zeldovich, Nickolai, Morris, Robert Tappan
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Association for Computing Machinery (ACM) 2012
Online Access: http://hdl.handle.net/1721.1/72689
https://orcid.org/0000-0003-0238-2703
https://orcid.org/0000-0003-2700-9286
Description: Incoming and outgoing processing for a given TCP connection often execute on different cores: an incoming packet is typically processed on the core that receives the interrupt, while outgoing data processing occurs on the core running the relevant user code. As a result, accesses to read/write connection state (such as TCP control blocks) often involve cache invalidations and data movement between cores' caches. These can take hundreds of processor cycles, enough to significantly reduce performance. We present a new design, called Affinity-Accept, that causes all processing for a given TCP connection to occur on the same core. Affinity-Accept arranges for the network interface to determine the core on which application processing for each new connection occurs, in a lightweight way; it adjusts the card's choices only in response to imbalances in CPU scheduling. Measurements show that for the Apache web server serving static files on a 48-core AMD system, Affinity-Accept reduces time spent in the TCP stack by 30% and improves overall throughput by 24%.
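The idea in the description can be illustrated with a toy model. The sketch below is not the authors' implementation: the class name `AffinitySketch`, the table size `NUM_FLOW_GROUPS`, the CRC32 hash, and the `rebalance` policy are all assumptions made for illustration. It only shows the general shape of the design: a steering table maps each connection to a core, each core accepts from its own queue, and the table changes only when a load imbalance is reported.

```python
import zlib
from collections import deque

NUM_FLOW_GROUPS = 64  # hypothetical size of the steering table


class AffinitySketch:
    """Toy model: per-core accept queues fed by a flow-group steering table."""

    def __init__(self, num_cores):
        self.num_cores = num_cores
        # Steering table: flow group -> core, initially assigned round-robin.
        self.group_to_core = [g % num_cores for g in range(NUM_FLOW_GROUPS)]
        self.accept_queues = [deque() for _ in range(num_cores)]

    def flow_group(self, conn):
        # conn is a (src_ip, src_port, dst_ip, dst_port) tuple. CRC32 stands in
        # for whatever hash a real network card would compute (an assumption).
        return zlib.crc32(repr(conn).encode()) % NUM_FLOW_GROUPS

    def steer(self, conn):
        """Model the card delivering a new connection to one core's queue."""
        core = self.group_to_core[self.flow_group(conn)]
        self.accept_queues[core].append(conn)
        return core

    def accept(self, core):
        """Return a connection from this core's own queue, or None if empty."""
        queue = self.accept_queues[core]
        return queue.popleft() if queue else None

    def rebalance(self, busy_core, idle_core):
        """Move one flow group from busy_core to idle_core.

        Returns the moved group index, or None if busy_core owns no group.
        """
        for group, core in enumerate(self.group_to_core):
            if core == busy_core:
                self.group_to_core[group] = idle_core
                return group
        return None
```

In this model, repeated calls to `steer` with the same 4-tuple return the same core until `rebalance` reassigns that connection's flow group, which mirrors the description's point that the card's choices are adjusted only in response to imbalance.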
Record: mit-1721.1/72689 (2022-09-30T18:44:20Z)
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Sponsors: National Science Foundation (U.S.) (Grant number CNS-0834415); National Science Foundation (U.S.) (Grant number CNS-0915164); Quanta Computer (Firm)
Dates: 2012-09-13T15:51:11Z; 2012-04
Type: Article; http://purl.org/eprint/type/ConferencePaper
ISBN: 978-1-4503-1223-3
Citation: Aleksey Pesterev, Jacob Strauss, Nickolai Zeldovich, and Robert T. Morris. 2012. Improving network connection locality on multicore systems. In Proceedings of the 7th ACM european conference on Computer Systems (EuroSys '12). ACM, New York, NY, USA, 337-350.
DOI: http://dx.doi.org/10.1145/2168836.2168870
License: Creative Commons Attribution-Noncommercial-Share Alike 3.0, http://creativecommons.org/licenses/by-nc-sa/3.0/
File format: application/pdf
Source: MIT web domain