Digital in-memory stochastic computing architecture for vector-matrix multiplication

Applications of Artificial Intelligence currently dominate the technology landscape. Meanwhile, conventional von Neumann architectures struggle with the data-movement bottleneck to meet the ever-increasing performance demands of these data-centric applications. Moreover, the...

Full description

Bibliographic Details
Main Authors: Shady Agwa, Themis Prodromakis
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-07-01
Series: Frontiers in Nanotechnology
Subjects:
Online Access: https://www.frontiersin.org/articles/10.3389/fnano.2023.1147396/full
_version_ 1797773559022485504
author Shady Agwa
Themis Prodromakis
author_facet Shady Agwa
Themis Prodromakis
author_sort Shady Agwa
collection DOAJ
description Applications of Artificial Intelligence currently dominate the technology landscape. Meanwhile, conventional von Neumann architectures struggle with the data-movement bottleneck to meet the ever-increasing performance demands of these data-centric applications. Moreover, the cost of vector-matrix multiplication in the binary domain is a major computational bottleneck for these applications. This paper introduces a novel digital in-memory stochastic computing architecture that leverages the simplicity of stochastic computing for in-memory vector-matrix multiplication. The proposed architecture incorporates several new approaches, including a new stochastic number generator with ideal binary-to-stochastic mapping, a best-seeding approach for sufficiently accurate low stochastic bit-precisions, a hybrid stochastic-binary accumulation approach for vector-matrix multiplication, and the conversion of conventional memory read operations into on-the-fly stochastic multiplication operations with negligible overhead. Thanks to the combination of these approaches, the accuracy analysis of the vector-matrix multiplication benchmark shows that scaling the stochastic bit-precision down from 16-bit to 4-bit achieves nearly the same average error (less than 3%). The derived analytical model of the proposed in-memory stochastic computing architecture demonstrates that the 4-bit stochastic architecture achieves the highest throughput per sub-array (122 Ops/Cycle), outperforming the 16-bit stochastic precision by 4.36x while still maintaining a small average error of 2.25%.
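The hybrid stochastic-binary scheme described in the abstract can be sketched as follows (a hypothetical illustration, not the paper's implementation): in unipolar stochastic computing, a value in [0, 1] is encoded as a bitstream whose probability of a 1 equals the value, multiplication reduces to a bitwise AND of two independent streams, and the dot product is accumulated in the binary domain by summing popcounts. A simple seeded Bernoulli generator stands in here for the paper's stochastic number generator, which instead guarantees an ideal binary-to-stochastic mapping; the fixed seed mirrors the paper's point that seeding matters at low precisions.

```python
import random

def encode(value, length, rng):
    # Hypothetical Bernoulli SNG: P(bit == 1) = value. The paper's SNG
    # achieves an ideal (exact) binary-to-stochastic mapping instead.
    return [1 if rng.random() < value else 0 for _ in range(length)]

def sc_dot(x, w_row, n_bits, rng):
    """Dot product with stochastic multiplies (bitwise AND) and
    binary accumulation (summed popcounts), per the hybrid scheme."""
    length = 2 ** n_bits  # n-bit stochastic precision -> stream length 2**n
    acc = 0
    for xi, wi in zip(x, w_row):
        xs = encode(xi, length, rng)
        ws = encode(wi, length, rng)
        # popcount of the ANDed streams estimates length * (xi * wi)
        acc += sum(a & b for a, b in zip(xs, ws))
    return acc / length  # scale the binary accumulator back to a real value

rng = random.Random(7)  # fixed seed: seeding drives low-precision accuracy
x = [0.5, 0.25, 0.75]
W = [[0.5, 0.5, 0.5], [1.0, 0.0, 1.0]]
y = [sc_dot(x, row, 16, rng) for row in W]          # stochastic estimate
exact = [sum(a * b for a, b in zip(x, row)) for row in W]
errors = [abs(a - b) for a, b in zip(y, exact)]     # small at 16-bit precision
```

Longer streams (higher stochastic bit-precision) shrink the estimation error at the cost of latency, which is the trade-off behind the paper's 16-bit vs. 4-bit comparison.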
first_indexed 2024-03-12T22:08:14Z
format Article
id doaj.art-9912dfa7dd734871a46011e8bc083200
institution Directory Open Access Journal
issn 2673-3013
language English
last_indexed 2024-03-12T22:08:14Z
publishDate 2023-07-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Nanotechnology
title Digital in-memory stochastic computing architecture for vector-matrix multiplication
topic stochastic computing
in-memory computing
beyond von-neumann architectures
vector-matrix multiplication
RRAM
deep neural network
url https://www.frontiersin.org/articles/10.3389/fnano.2023.1147396/full
work_keys_str_mv AT shadyagwa digitalinmemorystochasticcomputingarchitectureforvectormatrixmultiplication
AT themisprodromakis digitalinmemorystochasticcomputingarchitectureforvectormatrixmultiplication