Fast convolutional neural networks on FPGAs with hls4ml
Abstract: We introduce an automated tool for deploying ultra-low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond-latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.
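The two compression techniques the abstract names, magnitude pruning and reduced-precision (fixed-point) quantization, can be illustrated with a minimal, library-free sketch. This is not the paper's implementation: the pruning threshold and the ap_fixed-style bit widths below are illustrative assumptions, chosen only to show the mechanics of zeroing small weights and snapping the survivors onto a fixed-point grid of the kind an HLS tool would synthesize.

```python
def prune_low_magnitude(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold,
    mimicking magnitude-based pruning of a trained layer."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def quantize_fixed(value, total_bits=16, int_bits=6):
    """Round a value onto a signed fixed-point grid, in the spirit of an
    HLS ap_fixed<total_bits, int_bits> type: int_bits cover the integer
    part (including sign), the remaining bits are fractional."""
    frac_bits = total_bits - int_bits
    scale = 1 << frac_bits                     # 2**frac_bits steps per unit
    lo = -(1 << (int_bits - 1))                # most negative representable value
    hi = (1 << (int_bits - 1)) - 1.0 / scale   # most positive representable value
    q = round(value * scale) / scale           # snap to the fixed-point grid
    return min(max(q, lo), hi)                 # saturate on overflow

weights = [0.003, -0.41, 0.72, -0.002, 0.15]
pruned = prune_low_magnitude(weights, threshold=0.01)
quantized = [quantize_fixed(w) for w in pruned]
print(pruned)      # the two small weights become exactly zero
print(quantized)   # the survivors land on the 2**-10 grid
```

Zeroed weights cost nothing in the FPGA fabric (their multipliers can be removed at synthesis time), and narrower fixed-point words shrink the remaining multipliers, which is what drives the resource reductions the abstract reports.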
Main Authors: | Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Rankin, Dylan; Jindariani, Sergo; Pedro, Kevin; Tran, Nhan; Liu, Mia; Kreinar, Edward; Wu, Zhenbin; Hoang, Duc |
Other Authors: | Massachusetts Institute of Technology. Department of Physics |
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2022 |
Online Access: | https://hdl.handle.net/1721.1/142113 |
author | Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Rankin, Dylan; Jindariani, Sergo; Pedro, Kevin; Tran, Nhan; Liu, Mia; Kreinar, Edward; Wu, Zhenbin; Hoang, Duc |
author2 | Massachusetts Institute of Technology. Department of Physics |
collection | MIT |
description | Abstract: We introduce an automated tool for deploying ultra-low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond-latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation. |
format | Article |
id | mit-1721.1/142113 |
institution | Massachusetts Institute of Technology |
language | English |
publishDate | 2022 |
publisher | IOP Publishing |
record_format | dspace |
citation | Aarrestad, Thea, Loncar, Vladimir, Ghielmetti, Nicolò, Pierini, Maurizio, Summers, Sioni et al. 2021. "Fast convolutional neural networks on FPGAs with hls4ml." Machine Learning: Science and Technology, 2 (4). |
container_title | Machine Learning: Science and Technology |
doi | 10.1088/2632-2153/AC0EA1 |
date_issued | 2021 |
type | Journal Article (http://purl.org/eprint/type/JournalArticle) |
rights | Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/) |
title | Fast convolutional neural networks on FPGAs with hls4ml |
url | https://hdl.handle.net/1721.1/142113 |