Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm

Bibliographic Details
Main Authors: Robert Stewart, Andrew Nowlan, Pascal Bacchus, Quentin Ducasse, Ekaterina Komendantskaya
Format: Article
Language: English
Published: MDPI AG, 2021-02-01
Series: Electronics
Subjects: quantisation; evolutionary algorithm; neural network; FPGA; Movidius VPU
Online Access: https://www.mdpi.com/2079-9292/10/4/396
Collection: DOAJ
Description: This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames-per-second (FPS). Quantisation identifies a sweet spot of 3 bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2 and 3 bit quantised neural networks increases throughput from 6 k FPS to 373 k FPS, a 62× speedup.
ISSN: 2079-9292
DOI: 10.3390/electronics10040396
Author affiliations:
Robert Stewart: Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, EH14 4AS, UK
Andrew Nowlan: Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, EH14 4AS, UK
Pascal Bacchus: Inria Rennes-Bretagne Atlantique Research Centre, 35042 Rennes, France
Quentin Ducasse: Lab-STICC, École Nationale Supérieure de Techniques Avancées, 29200 Brest, France
Ekaterina Komendantskaya: Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, EH14 4AS, UK
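The abstract's central quantisation idea, mapping 32-bit floating-point weights onto a small number of discrete levels (e.g. the 3 bit sweet spot, giving 8 levels), can be illustrated with a generic uniform quantiser. This is a hedged sketch, not the paper's FPGA pipeline: the function name `quantise_uniform` and the NumPy "fake quantisation" round-trip back to the float domain are illustrative assumptions, chosen only to show why n-bit precision bounds the number of distinct weight values.

```python
import numpy as np

def quantise_uniform(weights, n_bits):
    """Uniformly quantise a float array onto 2**n_bits discrete levels.

    Illustrative only: maps each weight to the nearest of 2**n_bits
    evenly spaced levels spanning [min, max], then back to the float
    domain ("fake quantisation"), so the effect of reduced precision
    can be inspected in software.
    """
    intervals = 2 ** n_bits - 1          # 3 bits -> 7 intervals, 8 levels
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / intervals
    q = np.round((weights - w_min) / scale)   # integer level index 0..intervals
    return q * scale + w_min                  # dequantised float values

# A 3 bit quantisation can produce at most 8 distinct weight values.
w = np.random.randn(4, 4).astype(np.float32)
w3 = quantise_uniform(w, 3)
assert len(np.unique(w3)) <= 8
```

The quantisation error per weight is bounded by half the step size `scale`, which is the trade-off the abstract describes: fewer bits shrink hardware cost and latency but coarsen the representable weights.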