Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator
Unlike the deep neural network (DNN) inference process, the training process produces a huge amount of intermediate data that are needed to compute the new weights of the network. The on-chip global buffer (e.g., SRAM cache) generally has limited capacity because of its low memory density; therefore, off-chip DRAM access is inevitable during training. In this work, a novel ferroelectric field-effect transistor (FeFET)-based 3-D NAND architecture for an on-chip training accelerator is proposed. The reduced peripheral circuit overhead enabled by the low operating voltage of the FeFET device, together with the ultrahigh density of the 3-D NAND architecture, makes it possible to store and compute all the intermediate data on chip during training. We present a custom design of a 108-Gb chip with a 59.91-mm<sup>2</sup> area and 45% array efficiency. Data mapping schemes for weights, activations, and errors that are compatible with the 3-D NAND architecture are investigated. The training performance is evaluated by training the ResNet-18 model on this architecture with the ImageNet data set at 8-bit precision. Owing to the minimized off-chip memory access, an energy efficiency of 7.76 TOPS/W is achieved for 8-bit on-chip training.
Main Authors: | Wonbo Shim, Shimeng Yu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Journal on Exploratory Solid-State Computational Devices and Circuits |
Subjects: | 3-D NAND, compute-in-memory (CIM), deep neural network (DNN), ferroelectric transistor, on-chip training accelerator |
Online Access: | https://ieeexplore.ieee.org/document/9350264/ |
_version_ | 1798014841148932096 |
author | Wonbo Shim Shimeng Yu |
author_facet | Wonbo Shim Shimeng Yu |
author_sort | Wonbo Shim |
collection | DOAJ |
description | Unlike the deep neural network (DNN) inference process, the training process produces a huge amount of intermediate data that are needed to compute the new weights of the network. The on-chip global buffer (e.g., SRAM cache) generally has limited capacity because of its low memory density; therefore, off-chip DRAM access is inevitable during training. In this work, a novel ferroelectric field-effect transistor (FeFET)-based 3-D NAND architecture for an on-chip training accelerator is proposed. The reduced peripheral circuit overhead enabled by the low operating voltage of the FeFET device, together with the ultrahigh density of the 3-D NAND architecture, makes it possible to store and compute all the intermediate data on chip during training. We present a custom design of a 108-Gb chip with a 59.91-mm<sup>2</sup> area and 45% array efficiency. Data mapping schemes for weights, activations, and errors that are compatible with the 3-D NAND architecture are investigated. The training performance is evaluated by training the ResNet-18 model on this architecture with the ImageNet data set at 8-bit precision. Owing to the minimized off-chip memory access, an energy efficiency of 7.76 TOPS/W is achieved for 8-bit on-chip training. |
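The headline figures in the abstract (108 Gb in 59.91 mm² at 45% array efficiency, 7.76 TOPS/W) imply a bit density and a per-operation energy that follow from a few lines of arithmetic. A minimal sanity-check sketch, assuming decimal units (1 Gb = 10⁹ bits, 1 TOPS = 10¹² operations per second; the record does not state whether binary units are meant):

```python
# Sanity-check arithmetic for the figures quoted in the abstract.
# Assumptions: 1 Gb = 1e9 bits and 1 TOPS = 1e12 operations per second.

capacity_gb = 108      # chip capacity, Gb
area_mm2 = 59.91       # chip area, mm^2
array_eff = 0.45       # fraction of chip area occupied by the memory array
tops_per_watt = 7.76   # reported 8-bit training energy efficiency

bit_density = capacity_gb / area_mm2    # ~1.80 Gb per mm^2 of chip
array_area = area_mm2 * array_eff       # ~26.96 mm^2 devoted to the array
energy_per_op_pj = 1.0 / tops_per_watt  # 1 W / 7.76e12 op/s ~ 0.129 pJ/op

print(f"density {bit_density:.2f} Gb/mm^2, array {array_area:.2f} mm^2, "
      f"energy {energy_per_op_pj:.3f} pJ/op")
```

Note that 7.76 TOPS/W translates to roughly 0.13 pJ per 8-bit operation, which is the figure the "minimized off-chip memory access" claim is about: DRAM accesses typically cost orders of magnitude more energy per byte than on-chip operations.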
first_indexed | 2024-04-11T15:24:45Z |
format | Article |
id | doaj.art-72c080190bf2417d8ed2e07a85dc9aa4 |
institution | Directory Open Access Journal |
issn | 2329-9231 |
language | English |
last_indexed | 2024-04-11T15:24:45Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Journal on Exploratory Solid-State Computational Devices and Circuits |
spelling | doaj.art-72c080190bf2417d8ed2e07a85dc9aa4 | 2022-12-22T04:16:17Z | eng | IEEE | IEEE Journal on Exploratory Solid-State Computational Devices and Circuits | 2329-9231 | 2021-01-01 | 7 | 1 | 19 | 10.1109/JXCDC.2021.3057856 | 9350264 | Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator | Wonbo Shim (https://orcid.org/0000-0002-9669-7310) | Shimeng Yu (https://orcid.org/0000-0002-0068-3652) | School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA | https://ieeexplore.ieee.org/document/9350264/ | 3-D NAND, compute-in-memory (CIM), deep neural network (DNN), ferroelectric transistor, on-chip training accelerator |
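The abstract states that ResNet-18 is trained at 8-bit precision. Symmetric uniform quantization is a common way to map floating-point weights, activations, and errors to signed 8-bit integers in such accelerators; the sketch below is illustrative only, since the paper's actual quantization and array-mapping scheme is not specified in this record:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric uniform quantization to signed 8-bit integers.

    Illustrative only: the paper's actual scheme for mapping weights,
    activations, and errors onto the 3-D NAND arrays is not given here.
    """
    scale = float(np.max(np.abs(x))) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any positive scale works
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Round-trip a random weight tensor: the reconstruction error is
# bounded by half the quantization step (scale / 2).
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
assert np.max(np.abs(w - w_hat)) <= s / 2 + 1e-6
```

In a compute-in-memory setting the int8 values would be programmed into the FeFET NAND strings as multi-bit conductance states, so keeping the quantization symmetric simplifies the analog dot-product readout.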
spellingShingle | Wonbo Shim Shimeng Yu Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator IEEE Journal on Exploratory Solid-State Computational Devices and Circuits 3-D NAND compute-in-memory (CIM) deep neural network (DNN) ferroelectric transistor on-chip training accelerator |
title | Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator |
title_full | Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator |
title_fullStr | Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator |
title_full_unstemmed | Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator |
title_short | Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator |
title_sort | ferroelectric field effect transistor based 3 d nand architecture for energy efficient on chip training accelerator |
topic | 3-D NAND compute-in-memory (CIM) deep neural network (DNN) ferroelectric transistor on-chip training accelerator |
url | https://ieeexplore.ieee.org/document/9350264/ |
work_keys_str_mv | AT wonboshim ferroelectricfieldeffecttransistorbased3dnandarchitectureforenergyefficientonchiptrainingaccelerator AT shimengyu ferroelectricfieldeffecttransistorbased3dnandarchitectureforenergyefficientonchiptrainingaccelerator |