Carry-Propagation-Adder-Factored Gemmini Systolic Array for Machine Learning Acceleration

Systolic arrays are the primary part of modern deep learning accelerators and are being used widely in real-life applications such as self-driving cars. This paper presents a novel factored systolic array, where the carry propagation adder for accumulation and the rounding logic are extracted out fr...

Full description

Bibliographic Details
Main Authors: Kashif Inayat, Jaeyong Chung
Format: Article
Language:English
Published: MDPI AG 2021-03-01
Series:Electronics
Subjects:
Online Access:https://www.mdpi.com/2079-9292/10/6/652
Description
Summary:Systolic arrays are the primary part of modern deep learning accelerators and are being used widely in real-life applications such as self-driving cars. This paper presents a novel factored systolic array, where the carry propagation adder for accumulation and the rounding logic are extracted out from each processing element, which reduces the area, power and delay of the processing elements substantially. The factoring is performed in the column-wise manner and the cost of the factored logic, placed at each column output, is amortized by the processing elements in a column. We demonstrate the proposed factoring in an open source systolic array, Gemmini. The factoring technique does not change the functionality of the base design and is transparent to applications. We show that the proposed technique leads to substantial reduction in area and delay up to 45.3% and 23.7%, respectively, compared to the Gemmini baseline.
ISSN:2079-9292