Hardware for Deep Learning Acceleration

Bibliographic details
Main Authors: Choongseok Song, ChangMin Ye, Yonguk Sim, Doo Seok Jeong
Format: Article
Language: English
Published: Wiley 2024-10-01
Series: Advanced Intelligent Systems
Online access: https://doi.org/10.1002/aisy.202300762
Description
Abstract: Deep learning (DL) has proven to be one of the most pivotal components of machine learning given its notable performance in a variety of application domains. Neural networks (NNs) for DL are tailored to specific application domains by varying their topology and activation nodes. Nevertheless, the major operation type (the one with the largest computational complexity) is commonly the multiply‐accumulate operation, irrespective of topology. Recent trends in DL show NNs becoming deeper and larger, and their computational complexity thus becoming prohibitive. To cope with the consequent prohibitive computation latency, 1) general‐purpose hardware, e.g., central processing units and graphics processing units, has been redesigned, and 2) various DL accelerators have been newly introduced, e.g., neural processing units and computing‐in‐memory units for deep NN‐based DL, and neuromorphic processors for spiking NN‐based DL. In this review, these accelerators and their pros and cons are overviewed with particular focus on their performance and memory bandwidth.
ISSN: 2640-4567