Digital in-memory stochastic computing architecture for vector-matrix multiplication
Applications of artificial intelligence currently dominate the technology landscape. Meanwhile, conventional von Neumann architectures struggle with the data-movement bottleneck in meeting the ever-increasing performance demands of these data-centric applications. Moreover, the...
Main Authors: | Shady Agwa, Themis Prodromakis |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2023-07-01 |
Series: | Frontiers in Nanotechnology |
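Subjects: | stochastic computing; in-memory computing; beyond von-neumann architectures; vector-matrix multiplication; RRAM; deep neural network |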
Online Access: | https://www.frontiersin.org/articles/10.3389/fnano.2023.1147396/full |
_version_ | 1797773559022485504 |
---|---|
author | Shady Agwa; Themis Prodromakis |
author_facet | Shady Agwa; Themis Prodromakis |
author_sort | Shady Agwa |
collection | DOAJ |
description | Applications of artificial intelligence currently dominate the technology landscape. Meanwhile, conventional von Neumann architectures struggle with the data-movement bottleneck in meeting the ever-increasing performance demands of these data-centric applications. Moreover, the cost of vector-matrix multiplication in the binary domain is a major computational bottleneck for these applications. This paper introduces a novel digital in-memory stochastic computing architecture that leverages the simplicity of stochastic computing for in-memory vector-matrix multiplication. The proposed architecture incorporates several new approaches, including a new stochastic number generator with ideal binary-to-stochastic mapping, a best-seeding approach for accurate-enough low stochastic bit-precisions, a hybrid stochastic-binary accumulation approach for vector-matrix multiplication, and the conversion of conventional memory read operations into on-the-fly stochastic multiplication operations with negligible overhead. Thanks to the combination of these approaches, the accuracy analysis of the vector-matrix multiplication benchmark shows that scaling the stochastic bit-precision down from 16-bit to 4-bit yields nearly the same average error (less than 3%). The derived analytical model of the proposed in-memory stochastic computing architecture shows that the 4-bit stochastic architecture achieves the highest throughput per sub-array (122 Ops/Cycle), 4.36x higher than the 16-bit stochastic precision, while maintaining a small average error of 2.25%. |
first_indexed | 2024-03-12T22:08:14Z |
format | Article |
id | doaj.art-9912dfa7dd734871a46011e8bc083200 |
institution | Directory Open Access Journal |
issn | 2673-3013 |
language | English |
last_indexed | 2024-03-12T22:08:14Z |
publishDate | 2023-07-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Nanotechnology |
title | Digital in-memory stochastic computing architecture for vector-matrix multiplication |
topic | stochastic computing; in-memory computing; beyond von-neumann architectures; vector-matrix multiplication; RRAM; deep neural network |
url | https://www.frontiersin.org/articles/10.3389/fnano.2023.1147396/full |
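
The description in this record refers to stochastic multiplication with hybrid stochastic-binary accumulation. As a rough illustration of the general idea only, the sketch below uses the textbook unipolar encoding, in which a value in [0, 1] becomes a bitstream whose fraction of 1s approximates that value, so a bitwise AND of two independent streams approximates their product. The per-product 1s are then counted and summed in ordinary binary. The function names (`to_stream`, `sc_dot`), the software random generator, and the stream length are assumptions made for this example; the record does not specify the article's stochastic number generator, seeding approach, or memory-read mechanism, and none of them is reproduced here.

```python
import random


def to_stream(value, length, rng):
    """Encode a value in [0, 1] as a unipolar bitstream (hypothetical helper).

    Each bit is 1 with probability `value`, so the fraction of 1s
    approximates `value` for long streams.
    """
    return [1 if rng.random() < value else 0 for _ in range(length)]


def sc_dot(xs, ws, length=4096, seed=0):
    """Approximate sum(x * w) with stochastic multiply and binary accumulate.

    Assumption: inputs and weights are already scaled to [0, 1].
    """
    rng = random.Random(seed)
    total_ones = 0  # ordinary binary counter (the "binary" half of the hybrid)
    for x, w in zip(xs, ws):
        sx = to_stream(x, length, rng)
        sw = to_stream(w, length, rng)
        # AND of two independent unipolar streams has P(1) = x * w.
        total_ones += sum(a & b for a, b in zip(sx, sw))
    return total_ones / length


if __name__ == "__main__":
    xs = [0.25, 0.5, 0.75]
    ws = [0.5, 0.5, 0.5]
    exact = sum(x * w for x, w in zip(xs, ws))
    print("exact:", exact, "stochastic estimate:", sc_dot(xs, ws))
```

Shorter streams reduce the work per product at the cost of a noisier estimate, which is the kind of precision-versus-throughput trade-off the description reports.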