TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT)-based applications face stringent constraints, such as limited memory capacity and energy resources, owing to the large number of computations in convolution layers. To reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a Particle of Swarm Convolution Layer Optimization (PSCLO) algorithm.
Main Authors: | Dilshad Sabir, Muhammad Abdullah Hanif, Ali Hassan, Saad Rehman, Muhammad Shafique |
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
Subjects: | Convolutional neural network; reduced workload; winograd transform; particle of swarm convolution layer optimization; symmetry approximation; tile quantization approximation |
Online Access: | https://ieeexplore.ieee.org/document/9389774/ |
_version_ | 1819142923487281152 |
author | Dilshad Sabir Muhammad Abdullah Hanif Ali Hassan Saad Rehman Muhammad Shafique |
author_facet | Dilshad Sabir Muhammad Abdullah Hanif Ali Hassan Saad Rehman Muhammad Shafique |
author_sort | Dilshad Sabir |
collection | DOAJ |
description | Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT)-based applications face stringent constraints, such as limited memory capacity and energy resources, owing to the large number of computations in convolution layers. To reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a <italic>Particle of Swarm Convolution Layer Optimization (PSCLO)</italic> algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filters, termed <italic>symmetry approximation</italic>, and the structure of the Winograd algorithm, termed <italic>tile quantization approximation</italic>. PSCLO balances workload reduction against accuracy degradation for each convolution layer by selecting fine-tuned thresholds that control the intensity of each approximation. The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved <inline-formula> <tex-math notation="LaTeX">$\sim 5.28\text{x}$ </tex-math></inline-formula> multiplicative workload reduction without significant accuracy degradation (<0.1%) for ImageNet on ResNet-18, which is <inline-formula> <tex-math notation="LaTeX">$\sim 1.08\text{x}$ </tex-math></inline-formula> less multiplicative workload than state-of-the-art Winograd CNN pruning. For LeNet, the multiplicative workload reductions were <inline-formula> <tex-math notation="LaTeX">$\sim 3.87\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 3.93\text{x}$ </tex-math></inline-formula> on the MNIST and Fashion-MNIST datasets, respectively, with additive workload reductions of <inline-formula> <tex-math notation="LaTeX">$\sim 2.5\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 2.56\text{x}$ </tex-math></inline-formula>. There is no significant accuracy loss on the MNIST and Fashion-MNIST datasets. |
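The abstract's multiplicative-workload savings rest on Winograd minimal filtering. As an illustrative sketch only (not the paper's implementation; function names are my own), the 1-D algorithm F(2, 3) computes two outputs of a 3-tap convolution with 4 multiplications instead of the 6 a direct computation needs:

```python
# Illustrative Winograd minimal filtering F(2, 3) in 1-D:
# 2 outputs of a 3-tap convolution using 4 multiplications instead of 6.
def winograd_f23(d, g):
    """d: input tile of 4 samples, g: 3-tap filter."""
    # Filter transform U = G g (precomputable once per filter)
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2.0
    u2 = (g[0] - g[1] + g[2]) / 2.0
    u3 = g[2]
    # Input transform V = B^T d (additions/subtractions only)
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]
    # Elementwise products: the only 4 multiplications
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3
    # Output transform Y = A^T m
    return [m0 + m1 + m2, m1 - m2 - m3]

def direct_conv3(d, g):
    """Reference direct (valid) correlation: 6 multiplications."""
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(2)]
```

For example, `winograd_f23([5, 1, 7, 2], [2, 3, 4])` returns `[41.0, 31.0]`, matching `direct_conv3` on the same inputs. Tiling a long signal into overlapping 4-sample tiles extends this to full-length convolutions, and the 2-D F(2x2, 3x3) form commonly used in CNNs nests the same transforms along both axes.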
first_indexed | 2024-12-22T12:18:03Z |
format | Article |
id | doaj.art-2e1dac7b16aa46ca9801149bad935728 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-22T12:18:03Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-2e1dac7b16aa46ca9801149bad9357282022-12-21T18:26:05ZengIEEEIEEE Access2169-35362021-01-019536475366810.1109/ACCESS.2021.30699069389774TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry ApproximationDilshad Sabir0https://orcid.org/0000-0002-5322-9808Muhammmad Abdullah Hanif1https://orcid.org/0000-0001-9841-6132Ali Hassan2Saad Rehman3https://orcid.org/0000-0002-0487-0703Muhammad Shafique4https://orcid.org/0000-0002-2607-8135Department of Computer and Software Engineering, College of Electrical and Mechanical Engineering (E&ME), National University of Sciences and Technology, Islamabad, PakistanInstitute of Computer Engineering, Technische Universität Wien (TU Wien), Vienna, AustriaDepartment of Computer and Software Engineering, College of Electrical and Mechanical Engineering (E&ME), National University of Sciences and Technology, Islamabad, PakistanFaculty of Computer Engineering, HITEC University, Taxila, PakistanDivision of Engineering, New York University Abu Dhabi (NYU AD), Abu Dhabi, United Arab EmiratesConvolutional Neural Networks (CNNs) in the Internet-of-Things (IoT)-based applications face stringent constraints, like limited memory capacity and energy resources due to many computations in convolution layers. In order to reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a <italic>Particle of Swarm Convolution Layer Optimization (PSCLO)</italic> algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filter termed as <italic>symmetry approximation</italic> and Winograd algorithm structure termed as <italic>tile quantization approximation</italic>. PSCLO optimizes the balance between workload reduction and accuracy degradation for each convolution layer by selecting fine-tuned thresholds to control each approximation’s intensity. 
The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved <inline-formula> <tex-math notation="LaTeX">$\sim 5.28\text{x}$ </tex-math></inline-formula> multiplicative workload reduction without significant accuracy degradation (<0.1%) for ImageNet on ResNet-18, which is <inline-formula> <tex-math notation="LaTeX">$\sim 1.08\text{x}$ </tex-math></inline-formula> less multiplicative workload than state-of-the-art Winograd CNN pruning. For LeNet, the multiplicative workload reductions were <inline-formula> <tex-math notation="LaTeX">$\sim 3.87\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 3.93\text{x}$ </tex-math></inline-formula> on the MNIST and Fashion-MNIST datasets, respectively, with additive workload reductions of <inline-formula> <tex-math notation="LaTeX">$\sim 2.5\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 2.56\text{x}$ </tex-math></inline-formula>. There is no significant accuracy loss on the MNIST and Fashion-MNIST datasets.https://ieeexplore.ieee.org/document/9389774/Convolutional neural networkreduced workloadwinograd transformparticle of swarm convolution layer optimizationsymmetry approximationtile quantization approximation |
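The symmetry approximation mentioned in the abstract exploits filters whose taps are (nearly) mirror-symmetric. The toy sketch below illustrates only the general idea, not the paper's scheme; the function name `symmetric_conv3` and the tolerance parameter `tol` are hypothetical. When a 3-tap filter satisfies g[0] ≈ g[2], the two outer taps can share a single multiplication:

```python
# Toy illustration of exploiting filter symmetry: when a 3-tap filter is
# (nearly) mirror-symmetric, the two outer taps can share one multiply.
def symmetric_conv3(d, g, tol=1e-3):
    """1-D valid correlation; 2 multiplies per output when g[0] ~= g[2]."""
    n = len(d) - 2
    if abs(g[0] - g[2]) <= tol:
        s = (g[0] + g[2]) / 2.0  # approximate both outer taps by their mean
        return [s * (d[i] + d[i + 2]) + g[1] * d[i + 1] for i in range(n)]
    # Filter not symmetric enough: fall back to the exact computation
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(n)]
```

Here `tol` plays the role of a per-layer threshold: a larger value approximates more filters and saves more multiplications at the cost of accuracy, which is the kind of workload/accuracy trade-off the PSCLO algorithm is described as tuning for each convolution layer.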
spellingShingle | Dilshad Sabir Muhammad Abdullah Hanif Ali Hassan Saad Rehman Muhammad Shafique TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation IEEE Access Convolutional neural network reduced workload winograd transform particle of swarm convolution layer optimization symmetry approximation tile quantization approximation |
title | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_full | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_fullStr | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_full_unstemmed | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_short | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_sort | tiqsa workload minimization in convolutional neural networks using tile quantization and symmetry approximation |
topic | Convolutional neural network reduced workload winograd transform particle of swarm convolution layer optimization symmetry approximation tile quantization approximation |
url | https://ieeexplore.ieee.org/document/9389774/ |
work_keys_str_mv | AT dilshadsabir tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation AT muhammmadabdullahhanif tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation AT alihassan tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation AT saadrehman tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation AT muhammadshafique tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation |