Design of a high-throughput distributed shared-buffer NoC router
Router microarchitecture plays a central role in the performance of an on-chip network (NoC). Buffers are needed in routers to house incoming flits which cannot be immediately forwarded due to contention. This buffering can be done at the inputs or the outputs of a router, corresponding to an input-...
Main Authors: | Ramanujam, Rohit Sunkam; Soteriou, Vassos; Lin, Bill; Peh, Li-Shiuan |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
Format: | Article |
Language: | en_US |
Published: | Institute of Electrical and Electronics Engineers (IEEE), 2012 |
Online Access: | http://hdl.handle.net/1721.1/72481 https://orcid.org/0000-0001-9010-6519 |
_version_ | 1811086281757687808 |
---|---|
author | Ramanujam, Rohit Sunkam Soteriou, Vassos Lin, Bill Peh, Li-Shiuan |
author2 | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
author_facet | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ramanujam, Rohit Sunkam Soteriou, Vassos Lin, Bill Peh, Li-Shiuan |
author_sort | Ramanujam, Rohit Sunkam |
collection | MIT |
description | Router microarchitecture plays a central role in the performance of an on-chip network (NoC). Buffers are needed in routers to house incoming flits which cannot be immediately forwarded due to contention. This buffering can be done at the inputs or the outputs of a router, corresponding to an input-buffered router (IBR) or an output-buffered router (OBR). OBRs are attractive because they can sustain higher throughputs and have lower queuing delays under high loads than IBRs. However, a direct implementation of an OBR requires a router speedup equal to the number of ports, making such a design prohibitive under aggressive clocking needs and limited power budgets of most NoC applications. In this paper, we propose a new router design that aims to emulate an OBR practically, based on a distributed shared-buffer (DSB) router architecture. We introduce innovations to address the unique constraints of NoCs, including efficient pipelining and novel flow-control. We also present practical DSB configurations that can reduce the power overhead with negligible degradation in performance. The proposed DSB router achieves up to 19% higher throughput on synthetic traffic and reduces packet latency by 60% on average for SPLASH-2 benchmarks with high contention, compared to a state-of-the-art pipelined IBR. On average, the saturation throughput of DSB routers is within 10% of the theoretically ideal saturation throughput under the synthetic workloads evaluated. |
first_indexed | 2024-09-23T13:23:41Z |
format | Article |
id | mit-1721.1/72481 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T13:23:41Z |
publishDate | 2012 |
publisher | Institute of Electrical and Electronics Engineers (IEEE) |
record_format | dspace |
spelling | mit-1721.1/72481 2022-10-01T14:58:59Z Design of a high-throughput distributed shared-buffer NoC router Ramanujam, Rohit Sunkam Soteriou, Vassos Lin, Bill Peh, Li-Shiuan Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Peh, Li-Shiuan Peh, Li-Shiuan Router microarchitecture plays a central role in the performance of an on-chip network (NoC). Buffers are needed in routers to house incoming flits which cannot be immediately forwarded due to contention. This buffering can be done at the inputs or the outputs of a router, corresponding to an input-buffered router (IBR) or an output-buffered router (OBR). OBRs are attractive because they can sustain higher throughputs and have lower queuing delays under high loads than IBRs. However, a direct implementation of an OBR requires a router speedup equal to the number of ports, making such a design prohibitive under aggressive clocking needs and limited power budgets of most NoC applications. In this paper, we propose a new router design that aims to emulate an OBR practically, based on a distributed shared-buffer (DSB) router architecture. We introduce innovations to address the unique constraints of NoCs, including efficient pipelining and novel flow-control. We also present practical DSB configurations that can reduce the power overhead with negligible degradation in performance. The proposed DSB router achieves up to 19% higher throughput on synthetic traffic and reduces packet latency by 60% on average for SPLASH-2 benchmarks with high contention, compared to a state-of-the-art pipelined IBR. On average, the saturation throughput of DSB routers is within 10% of the theoretically ideal saturation throughput under the synthetic workloads evaluated. National Science Foundation (U.S.). (Grant number CCF-0702341) 2012-08-30T18:26:36Z 2012-08-30T18:26:36Z 2010-07 2010-05 Article http://purl.org/eprint/type/ConferencePaper 978-1-4244-7086-0 978-1-4244-7085-3 http://hdl.handle.net/1721.1/72481 Ramanujam, Rohit Sunkam et al. “Design of a High-Throughput Distributed Shared-Buffer NoC Router.” Fourth ACM/IEEE International Symposium on Networks-on-Chip 2010 (NOCS). 69–78. © Copyright 2010 IEEE https://orcid.org/0000-0001-9010-6519 en_US http://dx.doi.org/10.1109/NOCS.2010.17 Fourth ACM/IEEE International Symposium on Networks-on-Chip 2010 (NOCS) Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Institute of Electrical and Electronics Engineers (IEEE) IEEE |
spellingShingle | Ramanujam, Rohit Sunkam Soteriou, Vassos Lin, Bill Peh, Li-Shiuan Design of a high-throughput distributed shared-buffer NoC router |
title | Design of a high-throughput distributed shared-buffer NoC router |
title_full | Design of a high-throughput distributed shared-buffer NoC router |
title_fullStr | Design of a high-throughput distributed shared-buffer NoC router |
title_full_unstemmed | Design of a high-throughput distributed shared-buffer NoC router |
title_short | Design of a high-throughput distributed shared-buffer NoC router |
title_sort | design of a high throughput distributed shared buffer noc router |
url | http://hdl.handle.net/1721.1/72481 https://orcid.org/0000-0001-9010-6519 |
work_keys_str_mv | AT ramanujamrohitsunkam designofahighthroughputdistributedsharedbuffernocrouter AT soteriouvassos designofahighthroughputdistributedsharedbuffernocrouter AT linbill designofahighthroughputdistributedsharedbuffernocrouter AT pehlishiuan designofahighthroughputdistributedsharedbuffernocrouter |