Fast convolutional neural networks on FPGAs with hls4ml
Abstract: We introduce an automated tool for deploying ultra-low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond-latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation.
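The two compression techniques the abstract names, magnitude pruning and reduced-precision (fixed-point) quantization, can be illustrated with a minimal, library-free sketch. This is not the paper's implementation: the pruning threshold and the ap_fixed-style bit widths below are illustrative assumptions, chosen only to show the mechanics of zeroing small weights and snapping the survivors onto a fixed-point grid of the kind an HLS tool would synthesize.

```python
def prune_low_magnitude(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold,
    mimicking magnitude-based pruning of a trained layer."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def quantize_fixed(value, total_bits=16, int_bits=6):
    """Round a value onto a signed fixed-point grid, in the spirit of an
    HLS ap_fixed<total_bits, int_bits> type: int_bits cover the integer
    part (including sign), the remaining bits are fractional."""
    frac_bits = total_bits - int_bits
    scale = 1 << frac_bits                     # 2**frac_bits steps per unit
    lo = -(1 << (int_bits - 1))                # most negative representable value
    hi = (1 << (int_bits - 1)) - 1.0 / scale   # most positive representable value
    q = round(value * scale) / scale           # snap to the fixed-point grid
    return min(max(q, lo), hi)                 # saturate on overflow

weights = [0.003, -0.41, 0.72, -0.002, 0.15]
pruned = prune_low_magnitude(weights, threshold=0.01)
quantized = [quantize_fixed(w) for w in pruned]
print(pruned)      # the two small weights become exactly zero
print(quantized)   # the survivors land on the 2**-10 grid
```

Zeroed weights cost nothing in the FPGA fabric (their multipliers can be removed at synthesis time), and narrower fixed-point words shrink the remaining multipliers, which is what drives the resource reductions the abstract reports.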
Main Authors: | Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Rankin, Dylan; Jindariani, Sergo; Pedro, Kevin; Tran, Nhan; Liu, Mia; Kreinar, Edward; Wu, Zhenbin; Hoang, Duc |
Other Authors: | Massachusetts Institute of Technology. Department of Physics |
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2022 |
Online Access: | https://hdl.handle.net/1721.1/142113 |
author | Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Rankin, Dylan; Jindariani, Sergo; Pedro, Kevin; Tran, Nhan; Liu, Mia; Kreinar, Edward; Wu, Zhenbin; Hoang, Duc |
author2 | Massachusetts Institute of Technology. Department of Physics |
collection | MIT |
description | Abstract: We introduce an automated tool for deploying ultra-low-latency, low-power deep neural networks with convolutional layers on field-programmable gate arrays (FPGAs). By extending the hls4ml library, we demonstrate an inference latency of 5 µs using convolutional architectures, targeting microsecond-latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation. |
format | Article |
id | mit-1721.1/142113 |
institution | Massachusetts Institute of Technology |
language | English |
publishDate | 2022 |
publisher | IOP Publishing |
record_format | dspace |
citation | Aarrestad, Thea, Loncar, Vladimir, Ghielmetti, Nicolò, Pierini, Maurizio, Summers, Sioni et al. 2021. "Fast convolutional neural networks on FPGAs with hls4ml." Machine Learning: Science and Technology, 2 (4). |
container_title | Machine Learning: Science and Technology |
doi | 10.1088/2632-2153/AC0EA1 |
date_issued | 2021 |
type | Journal Article (http://purl.org/eprint/type/JournalArticle) |
rights | Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/) |
title | Fast convolutional neural networks on FPGAs with hls4ml |
url | https://hdl.handle.net/1721.1/142113 |