Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm
This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames per second (FPS). Quantisation identifies a sweet spot of 3-bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2- and 3-bit quantised neural networks increases throughput from 6k FPS to 373k FPS, a 62× speedup.
Main Authors: | Robert Stewart, Andrew Nowlan, Pascal Bacchus, Quentin Ducasse, Ekaterina Komendantskaya |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-02-01 |
Series: | Electronics |
Subjects: | quantisation; evolutionary algorithm; neural network; FPGA; Movidius VPU |
Online Access: | https://www.mdpi.com/2079-9292/10/4/396 |
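The abstract reports a "sweet spot" at 3-bit precision for quantised networks on the FPGA. As a rough illustration of what reducing weights to k-bit precision means, here is a minimal sketch of uniform symmetric weight quantisation; the scheme, function name and example values are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def quantise(weights, bits=3):
    """Uniformly quantise weights to (2**bits - 1) signed integer levels,
    then de-quantise back to floats. This round-trip mimics the value set
    a fixed-point low-bit datapath can represent."""
    w = np.asarray(weights, dtype=np.float64)
    # Step size: map the symmetric range [-max|w|, +max|w|] onto integer levels.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    if scale == 0.0:
        return w.copy()  # all-zero input: nothing to quantise
    q = np.clip(np.round(w / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale

# At 3 bits each weight collapses onto one of at most 8 levels.
w = np.array([0.70, -0.31, 0.02, -0.66])
print(quantise(w, bits=3))
```

Lower `bits` shrinks multiplier and memory cost on hardware at the price of rounding error, which is exactly the latency/hardware/accuracy trade-off the abstract describes.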
author | Robert Stewart; Andrew Nowlan; Pascal Bacchus; Quentin Ducasse; Ekaterina Komendantskaya
collection | DOAJ |
description | This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames per second (FPS). Quantisation identifies a sweet spot of 3-bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2- and 3-bit quantised neural networks increases throughput from 6k FPS to 373k FPS, a 62× speedup.
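NEMOKD builds on knowledge distillation, where a small student network is trained to match a larger teacher's soft outputs. The record gives no algorithmic detail of NEMOKD itself, so the sketch below shows only the standard Hinton-style distillation loss that such methods typically build on; the function names, temperature and mixing weight are assumptions, not the paper's values:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    e = np.exp((z - z.max()) / T)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.5):
    """Standard knowledge-distillation objective:
    alpha * cross-entropy with the hard label
    + (1 - alpha) * T^2 * cross-entropy with the teacher's softened outputs.
    The T^2 factor keeps the soft-target gradient magnitude comparable."""
    s_soft = softmax(student_logits, T)
    t_soft = softmax(teacher_logits, T)
    hard = -np.log(softmax(student_logits)[label])  # hard-label cross-entropy
    soft = -(t_soft * np.log(s_soft)).sum()         # soft-target cross-entropy
    return alpha * hard + (1 - alpha) * T**2 * soft
```

A multi-objective evolutionary search such as NEMOKD would then evolve student architectures while scoring candidates on objectives like this loss alongside latency.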
id | doaj.art-06d422cf92584c8998d75cf91b9ca81d |
institution | Directory Open Access Journal |
issn | 2079-9292 |
spelling | Electronics, vol. 10, no. 4, article 396, published 2021-02-01 by MDPI AG (ISSN 2079-9292). DOI: 10.3390/electronics10040396. Authors: Robert Stewart, Andrew Nowlan and Ekaterina Komendantskaya (Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK); Pascal Bacchus (Inria Rennes-Bretagne Atlantique Research Centre, 35042 Rennes, France); Quentin Ducasse (Lab-STICC, École Nationale Supérieure de Techniques Avancées, 29200 Brest, France). Online access: https://www.mdpi.com/2079-9292/10/4/396
title | Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm |
topic | quantisation; evolutionary algorithm; neural network; FPGA; Movidius VPU
url | https://www.mdpi.com/2079-9292/10/4/396 |