Straightforward data transfer in a blockwise dataflow for an analog RRAM-based CIM system

Bibliographic Details
Main Authors: Yuyi Liu, Bin Gao, Peng Yao, Qi Liu, Qingtian Zhang, Dong Wu, Jianshi Tang, He Qian, Huaqiang Wu
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-04-01
Series: Frontiers in Electronics
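Subjects: computation-in-memory, resistive random-access memory, computing-intensive, straightforward link, energy efficiency, throughput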
Online Access: https://www.frontiersin.org/articles/10.3389/felec.2023.1129675/full
author Yuyi Liu
Bin Gao
Peng Yao
Qi Liu
Qingtian Zhang
Dong Wu
Jianshi Tang
He Qian
Huaqiang Wu
collection DOAJ
description Analog resistive random-access memory (RRAM)-based computation-in-memory (CIM) technology is promising for constructing artificial intelligence (AI) with high energy efficiency and excellent scalability. However, the large overhead of analog-to-digital converters (ADCs) is a key limitation. In this work, we propose a novel LINKAGE architecture that eliminates PE-level ADCs and leverages an analog data transfer module to implement inter-array data processing. A blockwise dataflow is further proposed to accelerate convolutional neural networks (CNNs) by speeding up compute-intensive layers and resolving the unbalanced pipeline problem. To obtain accurate and reliable benchmark results, key component modules, such as straightforward link (SFL) modules and Tile-level ADCs, are designed in standard 28 nm CMOS technology. The evaluation shows that LINKAGE outperforms the conventional ADC/DAC-based architecture with a 2.07×∼11.22× improvement in throughput, a 2.45×∼7.00× improvement in energy efficiency, and a 22%–51% reduction in area overhead while maintaining accuracy. With the blockwise method, the LINKAGE architecture achieves an energy efficiency of 22.9∼24.4 TOPS/W (4b-IN/4b-W) and a throughput of 1.82∼4.53 TOPS. This work demonstrates a new method for significantly improving the energy efficiency of CIM chips, which can be applied to general CNNs/FCNNs.
first_indexed 2024-04-09T17:38:47Z
format Article
id doaj.art-b8e08c35656a4df083fbc440a9c51cb4
institution Directory Open Access Journal
issn 2673-5857
language English
last_indexed 2024-04-09T17:38:47Z
publishDate 2023-04-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Electronics
spelling doaj.art-b8e08c35656a4df083fbc440a9c51cb4; 2023-04-17T08:49:38Z; eng; Frontiers Media S.A.; Frontiers in Electronics; ISSN 2673-5857; 2023-04-01; vol. 4; DOI 10.3389/felec.2023.1129675; article 1129675
title Straightforward data transfer in a blockwise dataflow for an analog RRAM-based CIM system
topic computation-in-memory
resistive random-access memory
computing-intensive
straightforward link
energy efficiency
throughput
url https://www.frontiersin.org/articles/10.3389/felec.2023.1129675/full