Conductance-Aware Quantization Based on Minimum Error Substitution for Non-Linear-Conductance-State Tolerance in Neural Computing Systems
Emerging resistive random-access memory (ReRAM) has demonstrated great potential in the achievement of the in-memory computing paradigm to overcome the well-known “memory wall” in current von Neumann architecture. The ReRAM crossbar array (RCA) is a promising circuit structure to accelerate the vita...
Main Authors: | Chenglong Huang, Nuo Xu, Wenqing Wang, Yihong Hu, Liang Fang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-04-01 |
Series: | Micromachines |
Subjects: | ReRAM; non-linear conductance levels; conductance-aware quantization |
Online Access: | https://www.mdpi.com/2072-666X/13/5/667 |
_version_ | 1827667821630324736 |
---|---|
author | Chenglong Huang; Nuo Xu; Wenqing Wang; Yihong Hu; Liang Fang |
author_sort | Chenglong Huang |
collection | DOAJ |
description | Emerging resistive random-access memory (ReRAM) has demonstrated great potential for realizing the in-memory computing paradigm and overcoming the well-known “memory wall” of the current von Neumann architecture. The ReRAM crossbar array (RCA) is a promising circuit structure for accelerating the vital multiplication-and-accumulation (MAC) operations in deep neural networks (DNNs). However, because the conductance levels of ReRAM are distributed non-linearly, a large deviation arises when trained weights quantized on a linear scale are mapped directly to the non-linear conductance values of a realistic ReRAM device. This deviation degrades the inference accuracy of the RCA-based DNN. In this paper, we propose a conductance-aware quantization method based on minimum error substitution to eliminate the deviation in the mapping from the weights to the actual conductance values. The method is suitable for multiple ReRAM devices with different non-linear conductance distributions and is also immune to device variation. Simulation results on LeNet5, AlexNet and VGG16 demonstrate that, compared with the linear quantization method, this method largely recovers the accuracy lost to the non-linear resistance distribution of ReRAM devices. |
first_indexed | 2024-03-10T03:24:39Z |
format | Article |
id | doaj.art-02f85cecfba14937964d585e936a300a |
institution | Directory Open Access Journal |
issn | 2072-666X |
language | English |
last_indexed | 2024-03-10T03:24:39Z |
publishDate | 2022-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Micromachines |
spelling | doaj.art-02f85cecfba14937964d585e936a300a; 2023-11-23T12:11:18Z; eng; MDPI AG; Micromachines; 2072-666X; 2022-04-01; vol. 13, no. 5, art. 667; 10.3390/mi13050667; Conductance-Aware Quantization Based on Minimum Error Substitution for Non-Linear-Conductance-State Tolerance in Neural Computing Systems; Chenglong Huang (Institute for Quantum Information & State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China); Nuo Xu (College of Computer, National University of Defense Technology, Changsha 410073, China); Wenqing Wang (Institute for Quantum Information & State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China); Yihong Hu (Institute for Quantum Information & State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China); Liang Fang (Institute for Quantum Information & State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China); https://www.mdpi.com/2072-666X/13/5/667; ReRAM; non-linear conductance levels; conductance-aware quantization |
title | Conductance-Aware Quantization Based on Minimum Error Substitution for Non-Linear-Conductance-State Tolerance in Neural Computing Systems |
topic | ReRAM; non-linear conductance levels; conductance-aware quantization |
url | https://www.mdpi.com/2072-666X/13/5/667 |
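
The mapping problem summarized in the description can be illustrated with a small sketch. This record contains only the abstract, so what follows is not the authors' algorithm. It is one plausible reading of "minimum error substitution", namely replacing each target conductance with the nearest level the device can actually reach. The exponential device model, the conductance range, the 16-level count, and every function name are assumptions made for illustration, and signed weights are simply min-max scaled onto a single conductance range.

```python
import math


def linear_levels(g_min, g_max, n):
    """n evenly spaced conductance levels, as linear quantization assumes."""
    step = (g_max - g_min) / (n - 1)
    return [g_min + i * step for i in range(n)]


def nonlinear_levels(g_min, g_max, n, curvature=3.0):
    """Hypothetical device model: levels saturate exponentially toward g_max."""
    norm = 1.0 - math.exp(-curvature)
    return [
        g_min + (g_max - g_min) * (1.0 - math.exp(-curvature * i / (n - 1))) / norm
        for i in range(n)
    ]


def nearest_index(value, levels):
    """Index of the level with the smallest absolute error to value."""
    return min(range(len(levels)), key=lambda i: abs(levels[i] - value))


def program(weights, device_levels, conductance_aware):
    """Return (target, programmed) conductance pairs for each weight.

    Assumes the weights are not all equal, since they are min-max scaled.
    """
    g_min, g_max = device_levels[0], device_levels[-1]
    ideal = linear_levels(g_min, g_max, len(device_levels))
    w_min, w_max = min(weights), max(weights)
    pairs = []
    for w in weights:
        target = g_min + (w - w_min) / (w_max - w_min) * (g_max - g_min)
        if conductance_aware:
            # Pick the reachable device level closest to the target.
            idx = nearest_index(target, device_levels)
        else:
            # Pick an index on the ideal linear grid; the device then
            # programs its own (non-linear) level at that index.
            idx = nearest_index(target, ideal)
        pairs.append((target, device_levels[idx]))
    return pairs


def mean_abs_error(pairs):
    """Average absolute gap between target and programmed conductance."""
    return sum(abs(t - g) for t, g in pairs) / len(pairs)


# Illustrative values only: 16 levels between 1 uS and 100 uS.
levels = nonlinear_levels(1e-6, 1e-4, 16)
weights = [i / 50.0 - 1.0 for i in range(101)]
print("linear-index error:", mean_abs_error(program(weights, levels, False)))
print("nearest-level error:", mean_abs_error(program(weights, levels, True)))
```

Because the nearest-level choice minimizes the error over all device levels for each weight, its mean error can never exceed that of the linear-index mapping on the same device. How the published method handles retraining, device variation, or signed weights is not stated in this record and is not modeled here.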