A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network
It is a grand challenge to develop a highly parallel yet energy-efficient machine learning hardware accelerator. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power...
Main Authors: | Huang, Hantao; Ni, Leibin; Wang, Kanwen; Wang, Yuangang; Yu, Hao
---|---
Other Authors: | School of Electrical and Electronic Engineering
Format: | Journal Article
Language: | English
Published: | 2018
Subjects: | Tensorized Neural Network (TNN); RRAM Computing
Online Access: | https://hdl.handle.net/10356/87049 http://hdl.handle.net/10220/45222
author | Huang, Hantao; Ni, Leibin; Wang, Kanwen; Wang, Yuangang; Yu, Hao
author2 | School of Electrical and Electronic Engineering |
collection | NTU |
description | It is a grand challenge to develop a highly parallel yet energy-efficient machine learning hardware accelerator. This paper introduces a three-dimensional (3-D) multilayer CMOS-RRAM accelerator for a tensorized neural network. Highly parallel matrix-vector multiplication can be performed with low power in the proposed 3-D multilayer CMOS-RRAM accelerator. The adoption of tensorization significantly compresses the weight matrix of a neural network into far fewer parameters. Simulation results on the MNIST benchmark show that the proposed accelerator achieves a 1.283× speed-up, 4.276× energy saving, and 9.339× area saving compared to a 3-D CMOS-ASIC implementation, and a 6.37× speed-up and 2612× energy saving compared to a 2-D CPU implementation. In addition, 14.85× model compression is achieved by tensorization with acceptable accuracy loss.
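The record does not detail how the weight matrices are tensorized. As a minimal illustrative sketch, assuming the standard tensor-train (TT) factorization commonly used in tensorized neural networks, the parameter saving can be estimated by counting TT-core entries; the function name `tt_params` and the mode shapes and ranks below are hypothetical choices for illustration, not the paper's settings.

```python
# Illustrative sketch (not the paper's method or numbers): parameter
# counting for a tensor-train (TT) factorization of a fully-connected
# layer, the standard technique behind tensorized neural networks.

def tt_params(in_modes, out_modes, ranks):
    """Number of parameters in a TT-format weight tensor.

    A weight matrix of size prod(in_modes) x prod(out_modes) is stored
    as d cores, where core k has shape
    (ranks[k], in_modes[k], out_modes[k], ranks[k+1]).
    """
    assert len(in_modes) == len(out_modes) == len(ranks) - 1
    return sum(ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
               for k in range(len(in_modes)))

# Hypothetical example: a 1024 x 1024 dense layer split over 4 modes,
# with all internal TT-ranks set to 4 (boundary ranks are always 1).
in_modes, out_modes = [4, 8, 8, 4], [4, 8, 8, 4]
ranks = [1, 4, 4, 4, 1]

dense = 1024 * 1024
tt = tt_params(in_modes, out_modes, ranks)
print(f"dense: {dense}, TT: {tt}, compression: {dense / tt:.1f}x")
```

With these hypothetical ranks the dense layer shrinks from about 1M to roughly 2.2K parameters; the 14.85× compression reported in the abstract reflects the paper's own mode/rank choices and its accuracy constraint, not this example.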
format | Journal Article |
id | ntu-10356/87049 |
institution | Nanyang Technological University |
language | English |
publishDate | 2018 |
citation | Huang, H., Ni, L., Wang, K., Wang, Y., & Yu, H. (2018). A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network. IEEE Transactions on Nanotechnology, 17(4), 645-656.
journal | IEEE Transactions on Nanotechnology
issn | 1536-125X
doi | 10.1109/TNANO.2017.2732698
version | Accepted version
funding | NRF (National Research Foundation, Singapore); MOE (Ministry of Education, Singapore)
rights | © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: http://dx.doi.org/10.1109/TNANO.2017.2732698
extent | 12 p.
mimetype | application/pdf
title | A highly-parallel and energy-efficient 3D multi-layer CMOS-RRAM accelerator for tensorized neural network |
topic | Tensorized Neural Network (TNN); RRAM Computing
url | https://hdl.handle.net/10356/87049 http://hdl.handle.net/10220/45222 |