TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation


Bibliographic Details
Main Authors: Dilshad Sabir, Muhammad Abdullah Hanif, Ali Hassan, Saad Rehman, Muhammad Shafique
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9389774/
_version_ 1819142923487281152
author Dilshad Sabir
Muhammad Abdullah Hanif
Ali Hassan
Saad Rehman
Muhammad Shafique
author_facet Dilshad Sabir
Muhammad Abdullah Hanif
Ali Hassan
Saad Rehman
Muhammad Shafique
author_sort Dilshad Sabir
collection DOAJ
description Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT)-based applications face stringent constraints, such as limited memory capacity and energy budgets, largely due to the large number of computations in convolution layers. To reduce the computational workload of these layers, this paper proposes a hybrid convolution method in conjunction with a <italic>Particle of Swarm Convolution Layer Optimization (PSCLO)</italic> algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filters, termed <italic>symmetry approximation</italic>, and the structure of the Winograd algorithm, termed <italic>tile quantization approximation</italic>. PSCLO balances workload reduction against accuracy degradation for each convolution layer by selecting fine-tuned thresholds that control the intensity of each approximation. The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. They achieve a <inline-formula> <tex-math notation="LaTeX">$\sim 5.28\text{x}$ </tex-math></inline-formula> multiplicative workload reduction without significant accuracy degradation (&#x003C;0.1&#x0025;) for ImageNet on ResNet-18, which is <inline-formula> <tex-math notation="LaTeX">$\sim 1.08\text{x}$ </tex-math></inline-formula> less multiplicative workload than state-of-the-art Winograd CNN pruning. For LeNet, the multiplicative workload reductions are <inline-formula> <tex-math notation="LaTeX">$\sim 3.87\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 3.93\text{x}$ </tex-math></inline-formula> on the MNIST and Fashion-MNIST datasets, with additive workload reductions of <inline-formula> <tex-math notation="LaTeX">$\sim 2.5\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 2.56\text{x}$ </tex-math></inline-formula>, respectively. There is no significant accuracy loss on the MNIST or Fashion-MNIST datasets.
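The <italic>tile quantization approximation</italic> described above builds on the Winograd minimal-filtering algorithm. As an illustrative sketch of where the multiplicative savings come from (this reproduces only the standard Winograd F(2,3) identity, not the paper's implementation; all names below are our own):

```python
# Winograd minimal filtering F(2,3): two outputs of a 3-tap 1D convolution
# using 4 multiplications instead of the 6 a direct computation needs.

def direct_f23(d, g):
    # Direct 1D convolution over a 4-element input tile d with a
    # 3-tap filter g: 6 multiplications for 2 outputs.
    y0 = d[0]*g[0] + d[1]*g[1] + d[2]*g[2]
    y1 = d[1]*g[0] + d[2]*g[1] + d[3]*g[2]
    return y0, y1

def winograd_f23(d, g):
    # Winograd F(2,3): 4 multiplications (m1..m4) for the same 2 outputs.
    # The filter-side factors depend only on g and can be precomputed
    # once per filter, which is what makes the scheme cheap at inference.
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return m1 + m2 + m3, m2 - m3 - m4

d = [1.0, 2.0, 3.0, 4.0]   # input tile
g = [0.5, 1.0, -0.5]       # 3-tap filter
assert winograd_f23(d, g) == direct_f23(d, g)  # both give (1.0, 2.0)
```

In 2D, the analogous F(2x2, 3x3) reduces 36 multiplications per output tile to 16 (2.25x); quantizing transformed tiles and exploiting filter symmetry, as the paper proposes, pushes the savings further on a per-layer basis under PSCLO's thresholds.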
first_indexed 2024-12-22T12:18:03Z
format Article
id doaj.art-2e1dac7b16aa46ca9801149bad935728
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-22T12:18:03Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-2e1dac7b16aa46ca9801149bad9357282022-12-21T18:26:05ZengIEEEIEEE Access2169-35362021-01-019536475366810.1109/ACCESS.2021.30699069389774TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry ApproximationDilshad Sabir0https://orcid.org/0000-0002-5322-9808Muhammmad Abdullah Hanif1https://orcid.org/0000-0001-9841-6132Ali Hassan2Saad Rehman3https://orcid.org/0000-0002-0487-0703Muhammad Shafique4https://orcid.org/0000-0002-2607-8135Department of Computer and Software Engineering, College of Electrical and Mechanical Engineering (E&#x0026;ME), National University of Sciences and Technology, Islamabad, PakistanInstitute of Computer Engineering, Technische Universit&#x00E4;t Wien (TU Wien), Vienna, AustriaDepartment of Computer and Software Engineering, College of Electrical and Mechanical Engineering (E&#x0026;ME), National University of Sciences and Technology, Islamabad, PakistanFaculty of Computer Engineering, HITEC University, Taxila, PakistanDivision of Engineering, New York University Abu Dhabi (NYU AD), Abu Dhabi, United Arab EmiratesConvolutional Neural Networks (CNNs) in the Internet-of-Things (IoT)-based applications face stringent constraints, like limited memory capacity and energy resources due to many computations in convolution layers. In order to reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a <italic>Particle of Swarm Convolution Layer Optimization (PSCLO)</italic> algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filter termed as <italic>symmetry approximation</italic> and Winograd algorithm structure termed as <italic>tile quantization approximation</italic>. PSCLO optimizes the balance between workload reduction and accuracy degradation for each convolution layer by selecting fine-tuned thresholds to control each approximation&#x2019;s intensity. 
The proposed methods have been evaluated on ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved <inline-formula> <tex-math notation="LaTeX">$\sim 5.28\text{x}$ </tex-math></inline-formula> multiplicative workload reduction without significant accuracy degradation (&#x003C;0.1&#x0025;) for ImageNet on ResNet-18, which is <inline-formula> <tex-math notation="LaTeX">$\sim 1.08\text{x}$ </tex-math></inline-formula> less multiplicative workload as compared to state-of-the-art Winograd CNN pruning. For LeNet, <inline-formula> <tex-math notation="LaTeX">$\sim 3.87\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 3.93\text{x}$ </tex-math></inline-formula> was the multiplicative workload reduction for MNIST and Fashion-MNIST datasets. The additive workload reduction was <inline-formula> <tex-math notation="LaTeX">$\sim 2.5\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 2.56\text{x}$ </tex-math></inline-formula> for the respective datasets. There is no significant accuracy loss for MNIST and Fashion-MNIST dataset.https://ieeexplore.ieee.org/document/9389774/Convolutional neural networkreduced workloadwinograd transformparticle of swarm convolution layer optimizationsymmetry approximationtile quantization approximation
spellingShingle Dilshad Sabir
Muhammmad Abdullah Hanif
Ali Hassan
Saad Rehman
Muhammad Shafique
TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
IEEE Access
Convolutional neural network
reduced workload
winograd transform
particle of swarm convolution layer optimization
symmetry approximation
tile quantization approximation
title TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
title_full TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
title_fullStr TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
title_full_unstemmed TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
title_short TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
title_sort tiqsa workload minimization in convolutional neural networks using tile quantization and symmetry approximation
topic Convolutional neural network
reduced workload
winograd transform
particle of swarm convolution layer optimization
symmetry approximation
tile quantization approximation
url https://ieeexplore.ieee.org/document/9389774/
work_keys_str_mv AT dilshadsabir tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation
AT muhammmadabdullahhanif tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation
AT alihassan tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation
AT saadrehman tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation
AT muhammadshafique tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation