TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation
Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT)-based applications face stringent constraints, such as limited memory capacity and energy resources, owing to the large number of computations in convolution layers. To reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a Particle of Swarm Convolution Layer Optimization (PSCLO) algorithm.
Main Authors: | Dilshad Sabir, Muhammad Abdullah Hanif, Ali Hassan, Saad Rehman, Muhammad Shafique |
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
Subjects: | Convolutional neural network; reduced workload; winograd transform; particle of swarm convolution layer optimization; symmetry approximation; tile quantization approximation |
Online Access: | https://ieeexplore.ieee.org/document/9389774/ |
_version_ | 1819142923487281152 |
author | Dilshad Sabir Muhammad Abdullah Hanif Ali Hassan Saad Rehman Muhammad Shafique |
author_facet | Dilshad Sabir Muhammad Abdullah Hanif Ali Hassan Saad Rehman Muhammad Shafique |
author_sort | Dilshad Sabir |
collection | DOAJ |
description | Convolutional Neural Networks (CNNs) in Internet-of-Things (IoT)-based applications face stringent constraints, such as limited memory capacity and energy resources, owing to the large number of computations in convolution layers. To reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a <italic>Particle of Swarm Convolution Layer Optimization (PSCLO)</italic> algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filters, termed <italic>symmetry approximation</italic>, and the structure of the Winograd algorithm, termed <italic>tile quantization approximation</italic>. PSCLO balances workload reduction against accuracy degradation for each convolution layer by selecting fine-tuned thresholds that control the intensity of each approximation. The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved <inline-formula> <tex-math notation="LaTeX">$\sim 5.28\text{x}$ </tex-math></inline-formula> multiplicative workload reduction without significant accuracy degradation (<0.1%) for ImageNet on ResNet-18, which is <inline-formula> <tex-math notation="LaTeX">$\sim 1.08\text{x}$ </tex-math></inline-formula> less multiplicative workload than state-of-the-art Winograd CNN pruning. For LeNet, the multiplicative workload reductions were <inline-formula> <tex-math notation="LaTeX">$\sim 3.87\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 3.93\text{x}$ </tex-math></inline-formula> on the MNIST and Fashion-MNIST datasets, respectively, with additive workload reductions of <inline-formula> <tex-math notation="LaTeX">$\sim 2.5\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 2.56\text{x}$ </tex-math></inline-formula>. There is no significant accuracy loss on the MNIST and Fashion-MNIST datasets. |
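The abstract's multiplicative-workload savings rest on Winograd minimal filtering. As an illustrative sketch only (not the paper's implementation; function names are my own), the 1-D algorithm F(2, 3) computes two outputs of a 3-tap convolution with 4 multiplications instead of the 6 a direct computation needs:

```python
# Illustrative Winograd minimal filtering F(2, 3) in 1-D:
# 2 outputs of a 3-tap convolution using 4 multiplications instead of 6.
def winograd_f23(d, g):
    """d: input tile of 4 samples, g: 3-tap filter."""
    # Filter transform U = G g (precomputable once per filter)
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2.0
    u2 = (g[0] - g[1] + g[2]) / 2.0
    u3 = g[2]
    # Input transform V = B^T d (additions/subtractions only)
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]
    # Elementwise products: the only 4 multiplications
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3
    # Output transform Y = A^T m
    return [m0 + m1 + m2, m1 - m2 - m3]

def direct_conv3(d, g):
    """Reference direct (valid) correlation: 6 multiplications."""
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(2)]
```

For example, `winograd_f23([5, 1, 7, 2], [2, 3, 4])` returns `[41.0, 31.0]`, matching `direct_conv3` on the same inputs. Tiling a long signal into overlapping 4-sample tiles extends this to full-length convolutions, and the 2-D F(2x2, 3x3) form commonly used in CNNs nests the same transforms along both axes.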
first_indexed | 2024-12-22T12:18:03Z |
format | Article |
id | doaj.art-2e1dac7b16aa46ca9801149bad935728 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-22T12:18:03Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-2e1dac7b16aa46ca9801149bad9357282022-12-21T18:26:05ZengIEEEIEEE Access2169-35362021-01-019536475366810.1109/ACCESS.2021.30699069389774TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry ApproximationDilshad Sabir0https://orcid.org/0000-0002-5322-9808Muhammmad Abdullah Hanif1https://orcid.org/0000-0001-9841-6132Ali Hassan2Saad Rehman3https://orcid.org/0000-0002-0487-0703Muhammad Shafique4https://orcid.org/0000-0002-2607-8135Department of Computer and Software Engineering, College of Electrical and Mechanical Engineering (E&ME), National University of Sciences and Technology, Islamabad, PakistanInstitute of Computer Engineering, Technische Universität Wien (TU Wien), Vienna, AustriaDepartment of Computer and Software Engineering, College of Electrical and Mechanical Engineering (E&ME), National University of Sciences and Technology, Islamabad, PakistanFaculty of Computer Engineering, HITEC University, Taxila, PakistanDivision of Engineering, New York University Abu Dhabi (NYU AD), Abu Dhabi, United Arab EmiratesConvolutional Neural Networks (CNNs) in the Internet-of-Things (IoT)-based applications face stringent constraints, like limited memory capacity and energy resources due to many computations in convolution layers. In order to reduce the computational workload in these layers, this paper proposes a hybrid convolution method in conjunction with a <italic>Particle of Swarm Convolution Layer Optimization (PSCLO)</italic> algorithm. The hybrid convolution is an approximation that exploits the inherent symmetry of filter termed as <italic>symmetry approximation</italic> and Winograd algorithm structure termed as <italic>tile quantization approximation</italic>. PSCLO optimizes the balance between workload reduction and accuracy degradation for each convolution layer by selecting fine-tuned thresholds to control each approximation’s intensity. 
The proposed methods have been evaluated on the ImageNet, MNIST, Fashion-MNIST, SVHN, and CIFAR-10 datasets. The proposed techniques achieved <inline-formula> <tex-math notation="LaTeX">$\sim 5.28\text{x}$ </tex-math></inline-formula> multiplicative workload reduction without significant accuracy degradation (<0.1%) for ImageNet on ResNet-18, which is <inline-formula> <tex-math notation="LaTeX">$\sim 1.08\text{x}$ </tex-math></inline-formula> less multiplicative workload than state-of-the-art Winograd CNN pruning. For LeNet, the multiplicative workload reductions were <inline-formula> <tex-math notation="LaTeX">$\sim 3.87\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 3.93\text{x}$ </tex-math></inline-formula> on the MNIST and Fashion-MNIST datasets, respectively, with additive workload reductions of <inline-formula> <tex-math notation="LaTeX">$\sim 2.5\text{x}$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$\sim 2.56\text{x}$ </tex-math></inline-formula>. There is no significant accuracy loss on the MNIST and Fashion-MNIST datasets.https://ieeexplore.ieee.org/document/9389774/Convolutional neural networkreduced workloadwinograd transformparticle of swarm convolution layer optimizationsymmetry approximationtile quantization approximation |
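The symmetry approximation mentioned in the abstract exploits filters whose taps are (nearly) mirror-symmetric. The toy sketch below illustrates only the general idea, not the paper's scheme; the function name `symmetric_conv3` and the tolerance parameter `tol` are hypothetical. When a 3-tap filter satisfies g[0] ≈ g[2], the two outer taps can share a single multiplication:

```python
# Toy illustration of exploiting filter symmetry: when a 3-tap filter is
# (nearly) mirror-symmetric, the two outer taps can share one multiply.
def symmetric_conv3(d, g, tol=1e-3):
    """1-D valid correlation; 2 multiplies per output when g[0] ~= g[2]."""
    n = len(d) - 2
    if abs(g[0] - g[2]) <= tol:
        s = (g[0] + g[2]) / 2.0  # approximate both outer taps by their mean
        return [s * (d[i] + d[i + 2]) + g[1] * d[i + 1] for i in range(n)]
    # Filter not symmetric enough: fall back to the exact computation
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(n)]
```

Here `tol` plays the role of a per-layer threshold: a larger value approximates more filters and saves more multiplications at the cost of accuracy, which is the kind of workload/accuracy trade-off the PSCLO algorithm is described as tuning for each convolution layer.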
spellingShingle | Dilshad Sabir Muhammad Abdullah Hanif Ali Hassan Saad Rehman Muhammad Shafique TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation IEEE Access Convolutional neural network reduced workload winograd transform particle of swarm convolution layer optimization symmetry approximation tile quantization approximation |
title | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_full | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_fullStr | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_full_unstemmed | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_short | TiQSA: Workload Minimization in Convolutional Neural Networks Using Tile Quantization and Symmetry Approximation |
title_sort | tiqsa workload minimization in convolutional neural networks using tile quantization and symmetry approximation |
topic | Convolutional neural network reduced workload winograd transform particle of swarm convolution layer optimization symmetry approximation tile quantization approximation |
url | https://ieeexplore.ieee.org/document/9389774/ |
work_keys_str_mv | AT dilshadsabir tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation AT muhammmadabdullahhanif tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation AT alihassan tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation AT saadrehman tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation AT muhammadshafique tiqsaworkloadminimizationinconvolutionalneuralnetworksusingtilequantizationandsymmetryapproximation |