Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator

Unlike the deep neural network (DNN) inference process, the training process produces a huge amount of intermediate data that is needed to compute the new weights of the network. The on-chip global buffer (e.g., SRAM cache) generally has limited capacity because of its low memory density, so off-chip DRAM access is inevitable during training. In this work, a novel ferroelectric field-effect transistor (FeFET)-based 3-D NAND architecture for an on-chip training accelerator is proposed. The reduced peripheral circuit overhead enabled by the low operating voltage of the FeFET device, combined with the ultrahigh density of the 3-D NAND architecture, makes it possible to store and process all the intermediate data on chip during training. We present a custom design of a 108-Gb chip with a 59.91-mm² area and 45% array efficiency. Data mapping schemes for weights, activations, and errors that are compatible with the 3-D NAND architecture are investigated. Training performance was evaluated by training the ResNet-18 model on this architecture with the ImageNet data set at 8-bit precision. Owing to the minimized off-chip memory access, an energy efficiency of 7.76 TOPS/W was achieved for 8-bit on-chip training.
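As a concrete illustration of the abstract's central point (not code from the paper), the NumPy sketch below walks through one 8-bit quantized layer: the forward pass is the multiply-accumulate that compute-in-memory performs inside the memory array, and the backward pass shows why the activations and errors must be retained as intermediate data for the weight update. The quantization scheme and all function and variable names are illustrative assumptions.

    import numpy as np

    def quantize_int8(t):
        """Symmetric linear quantization to a signed 8-bit grid (illustrative)."""
        scale = float(np.max(np.abs(t))) / 127.0
        if scale == 0.0:
            scale = 1.0
        q = np.clip(np.round(t / scale), -128, 127).astype(np.int8)
        return q, scale

    rng = np.random.default_rng(0)
    x = rng.standard_normal((64, 256)).astype(np.float32)   # input activations
    w = rng.standard_normal((256, 128)).astype(np.float32)  # layer weights

    xq, sx = quantize_int8(x)
    wq, sw = quantize_int8(w)

    # Forward pass: int8 multiply-accumulate with int32 accumulation --
    # the MAC operation a compute-in-memory array carries out in place.
    y = xq.astype(np.int32) @ wq.astype(np.int32)            # (64, 128)

    # Backward pass: the error arriving from the next layer, also 8-bit.
    dy = rng.standard_normal((64, 128)).astype(np.float32)
    dyq, sdy = quantize_int8(dy)

    # The weight update needs the *stored* activations xq: this is the
    # intermediate data that normally forces off-chip DRAM traffic and
    # that the proposed 108-Gb FeFET 3-D NAND design keeps on chip.
    dw = xq.astype(np.int32).T @ dyq.astype(np.int32)        # (256, 128)
    w -= 1e-3 * (sx * sdy) * dw.astype(np.float32)

In the paper's architecture, the buffering of these activations and errors and the equivalent MAC operations take place inside the FeFET 3-D NAND arrays themselves, which is what eliminates the off-chip memory traffic behind the reported 7.76 TOPS/W.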

Bibliographic Details
Main Authors: Wonbo Shim (https://orcid.org/0000-0002-9669-7310), Shimeng Yu (https://orcid.org/0000-0002-0068-3652)
Affiliation: School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
ISSN: 2329-9231
DOI: 10.1109/JXCDC.2021.3057856
Subjects: 3-D NAND; compute-in-memory (CIM); deep neural network (DNN); ferroelectric transistor; on-chip training accelerator
Online Access: https://ieeexplore.ieee.org/document/9350264/