Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis

This paper presents a method to effectively compress the intermediate layer feature map of a convolutional neural network for the potential structures of Video Coding for Machines, which is an emerging technology for future machine consumption applications. Notably, most extant studies compress a si...

Bibliographic Details
Main Authors: Minhun Lee, Hansol Choi, Jihoon Kim, Jihoon Do, Hyoungjin Kwon, Se Yoon Jeong, Donggyu Sim, Seoung-Jun Oh
Format: Article
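Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Moving picture experts group; video coding for machines; convolutional neural network; principal component analysis; feature map compression
Online Access: https://ieeexplore.ieee.org/document/10064277/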
author Minhun Lee
Hansol Choi
Jihoon Kim
Jihoon Do
Hyoungjin Kwon
Se Yoon Jeong
Donggyu Sim
Seoung-Jun Oh
collection DOAJ
description This paper presents a method to effectively compress the intermediate layer feature map of a convolutional neural network for the potential structures of Video Coding for Machines, which is an emerging technology for future machine consumption applications. Notably, most extant studies compress a single feature map and hence cannot entirely consider both global and local information within the feature map. This limits performance maintenance during machine consumption tasks that analyze objects with various sizes in images/videos. To address this problem, a multiscale feature map compression method is proposed that consists of two major processes: receptive block based principal component analysis (RPCA) and uniform integer quantization. The RPCA derives the complete basis kernels of a feature map by selecting a set of major basis kernels that can represent a sufficient percentage of global or local information according to the variable-size receptive blocks of each feature map. After transforming each feature map using the set of major basis kernels, a uniform integer quantizer converts the 32-bit floating-point values of the set of major basis kernels, corresponding RPCA coefficients, and a mean vector to five-bit integer representation values. Experiment results reveal that the proposed method reduces the amount of feature maps by 99.30% with a loss of 8.30% in the average precision (AP) on the OpenImageV6 dataset and 0.77% in $AP_{M}$ and 0.47% in $AP_{L}$ on the MS COCO 2017 validation set while outperforming previous PCA-based feature map compression methods even at higher compression rates.
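The two processes named in the description can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not say how receptive block sizes are chosen per feature map, how the "sufficient percentage" is set, or how the quantizer ranges are signalled. The function names (`block_pca`, `reconstruct`, `quantize`, `dequantize`), the non-overlapping patch layout, the `energy` threshold, and the min/max quantizer range are all assumptions made for illustration.

```python
import numpy as np

def block_pca(feature_map, block, energy=0.95):
    """PCA over non-overlapping block x block patches of a (C, H, W) array.

    Hypothetical helper: every patch is flattened into one sample, and the
    fewest leading components whose cumulative variance reaches `energy`
    are kept as the "major basis kernels".
    """
    c, h, w = feature_map.shape
    hb, wb = h // block, w // block
    # Crop to a multiple of the block size, then cut into patches.
    x = feature_map[:, :hb * block, :wb * block]
    x = x.reshape(c, hb, block, wb, block).transpose(0, 1, 3, 2, 4)
    samples = x.reshape(-1, block * block)        # (num_patches, block*block)
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Rows of vt are the principal directions of the centered patches.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    var = s ** 2
    ratio = np.cumsum(var) / var.sum()            # assumes a non-constant input
    k = min(int(np.searchsorted(ratio, energy)) + 1, len(s))
    basis = vt[:k]                                # (k, block*block)
    coeffs = centered @ basis.T                   # (num_patches, k)
    return basis, coeffs, mean

def reconstruct(basis, coeffs, mean, shape, block):
    """Invert block_pca for a (C, H, W) shape whose H and W divide by block."""
    c, h, w = shape
    hb, wb = h // block, w // block
    samples = coeffs @ basis + mean
    x = samples.reshape(c, hb, wb, block, block).transpose(0, 1, 3, 2, 4)
    return x.reshape(c, hb * block, wb * block)

def quantize(x, bits=5):
    """Uniform quantizer mapping floats to integers in [0, 2**bits - 1]."""
    lo, hi = float(x.min()), float(x.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.clip(np.round((x - lo) / scale), 0, levels).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    """Map quantized integers back to approximate float values."""
    return q.astype(np.float32) * scale + lo
```

In this sketch, a larger `block` captures more global structure per basis kernel and a smaller one more local detail, which is one plausible reading of the variable-size receptive blocks. Going from 32-bit floats to five-bit integers alone removes 1 - 5/32 = 84.375% of the bits per stored value; any reduction beyond that has to come from keeping only a subset of the basis kernels.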
first_indexed 2024-04-09T20:41:12Z
format Article
id doaj.art-6df67ae77e6245adb4f69ed7b4aed9b8
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-09T20:41:12Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-6df67ae77e6245adb4f69ed7b4aed9b8
Last updated: 2023-03-29T23:00:19Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Publication date: 2023-01-01
Volume: 11
Pages: 26308-26319
DOI: 10.1109/ACCESS.2023.3254589
Article number: 10064277
Title: Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis
Authors:
0. Minhun Lee (https://orcid.org/0000-0001-8165-5380), Department of Computer Engineering, Kwangwoon University, Seoul, South Korea
1. Hansol Choi, Department of Computer Engineering, Kwangwoon University, Seoul, South Korea
2. Jihoon Kim, Department of Computer Engineering, Kwangwoon University, Seoul, South Korea
3. Jihoon Do (https://orcid.org/0000-0002-8254-2481), Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea
4. Hyoungjin Kwon, Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea
5. Se Yoon Jeong (https://orcid.org/0000-0002-1675-4814), Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea
6. Donggyu Sim (https://orcid.org/0000-0002-2794-9932), Department of Computer Engineering, Kwangwoon University, Seoul, South Korea
7. Seoung-Jun Oh (https://orcid.org/0000-0002-7249-3647), Department of Electronic Engineering, Kwangwoon University, Seoul, South Korea
URL: https://ieeexplore.ieee.org/document/10064277/
Keywords: Moving picture experts group; video coding for machines; convolutional neural network; principal component analysis; feature map compression
title Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis
topic Moving picture experts group
video coding for machines
convolutional neural network
principal component analysis
feature map compression
url https://ieeexplore.ieee.org/document/10064277/