Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm
This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames per second (FPS). Quantisation identifies a sweet spot of 3-bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2- and 3-bit quantised neural networks increases throughput from 6k FPS to 373k FPS, a 62× speedup.
Main Authors: | Robert Stewart, Andrew Nowlan, Pascal Bacchus, Quentin Ducasse, Ekaterina Komendantskaya |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-02-01 |
Series: | Electronics |
Subjects: | quantisation; evolutionary algorithm; neural network; FPGA; Movidius VPU |
Online Access: | https://www.mdpi.com/2079-9292/10/4/396 |
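The abstract reports a "sweet spot" at 3-bit precision for quantised networks on the FPGA. As a rough illustration of what reducing weights to k-bit precision means, here is a minimal sketch of uniform symmetric weight quantisation; the scheme, function name and example values are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def quantise(weights, bits=3):
    """Uniformly quantise weights to (2**bits - 1) signed integer levels,
    then de-quantise back to floats. This round-trip mimics the value set
    a fixed-point low-bit datapath can represent."""
    w = np.asarray(weights, dtype=np.float64)
    # Step size: map the symmetric range [-max|w|, +max|w|] onto integer levels.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    if scale == 0.0:
        return w.copy()  # all-zero input: nothing to quantise
    q = np.clip(np.round(w / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale

# At 3 bits each weight collapses onto one of at most 8 levels.
w = np.array([0.70, -0.31, 0.02, -0.66])
print(quantise(w, bits=3))
```

Lower `bits` shrinks multiplier and memory cost on hardware at the price of rounding error, which is exactly the latency/hardware/accuracy trade-off the abstract describes.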
author | Robert Stewart; Andrew Nowlan; Pascal Bacchus; Quentin Ducasse; Ekaterina Komendantskaya
collection | DOAJ |
description | This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames per second (FPS). Quantisation identifies a sweet spot of 3-bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2- and 3-bit quantised neural networks increases throughput from 6k FPS to 373k FPS, a 62× speedup.
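NEMOKD builds on knowledge distillation, where a small student network is trained to match a larger teacher's soft outputs. The record gives no algorithmic detail of NEMOKD itself, so the sketch below shows only the standard Hinton-style distillation loss that such methods typically build on; the function names, temperature and mixing weight are assumptions, not the paper's values:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    e = np.exp((z - z.max()) / T)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.5):
    """Standard knowledge-distillation objective:
    alpha * cross-entropy with the hard label
    + (1 - alpha) * T^2 * cross-entropy with the teacher's softened outputs.
    The T^2 factor keeps the soft-target gradient magnitude comparable."""
    s_soft = softmax(student_logits, T)
    t_soft = softmax(teacher_logits, T)
    hard = -np.log(softmax(student_logits)[label])  # hard-label cross-entropy
    soft = -(t_soft * np.log(s_soft)).sum()         # soft-target cross-entropy
    return alpha * hard + (1 - alpha) * T**2 * soft
```

A multi-objective evolutionary search such as NEMOKD would then evolve student architectures while scoring candidates on objectives like this loss alongside latency.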
id | doaj.art-06d422cf92584c8998d75cf91b9ca81d |
institution | Directory Open Access Journal |
issn | 2079-9292 |
spelling | Electronics, vol. 10, no. 4, article 396, published 2021-02-01 by MDPI AG (ISSN 2079-9292). DOI: 10.3390/electronics10040396. Authors: Robert Stewart, Andrew Nowlan and Ekaterina Komendantskaya (Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK); Pascal Bacchus (Inria Rennes-Bretagne Atlantique Research Centre, 35042 Rennes, France); Quentin Ducasse (Lab-STICC, École Nationale Supérieure de Techniques Avancées, 29200 Brest, France). Online access: https://www.mdpi.com/2079-9292/10/4/396
title | Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm |
topic | quantisation; evolutionary algorithm; neural network; FPGA; Movidius VPU
url | https://www.mdpi.com/2079-9292/10/4/396 |