Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis

This paper presents a method to effectively compress the intermediate layer feature map of a convolutional neural network for the potential structures of Video Coding for Machines, which is an emerging technology for future machine consumption applications. Notably, most extant studies compress a si...

Bibliographic Details
Main Authors:	Minhun Lee, Hansol Choi, Jihoon Kim, Jihoon Do, Hyoungjin Kwon, Se Yoon Jeong, Donggyu Sim, Seoung-Jun Oh
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	Moving picture experts group; video coding for machines; convolutional neural network; principal component analysis; feature map compression
Online Access:	https://ieeexplore.ieee.org/document/10064277/

_version_	1797856485536956416
author	Minhun Lee, Hansol Choi, Jihoon Kim, Jihoon Do, Hyoungjin Kwon, Se Yoon Jeong, Donggyu Sim, Seoung-Jun Oh
author_facet	Minhun Lee, Hansol Choi, Jihoon Kim, Jihoon Do, Hyoungjin Kwon, Se Yoon Jeong, Donggyu Sim, Seoung-Jun Oh
author_sort	Minhun Lee
collection	DOAJ
description	This paper presents a method to effectively compress the intermediate-layer feature map of a convolutional neural network for the potential structures of Video Coding for Machines, an emerging technology for future machine consumption applications. Most existing studies compress a single feature map and therefore cannot fully account for both the global and local information within the feature map. This limits how well performance is maintained in machine consumption tasks that analyze objects of various sizes in images and videos. To address this problem, a multiscale feature map compression method is proposed that consists of two major processes: receptive block based principal component analysis (RPCA) and uniform integer quantization. The RPCA derives the complete basis kernels of a feature map by selecting a set of major basis kernels that can represent a sufficient percentage of global or local information according to the variable-size receptive blocks of each feature map. After each feature map is transformed using the set of major basis kernels, a uniform integer quantizer converts the 32-bit floating-point values of the set of major basis kernels, the corresponding RPCA coefficients, and a mean vector to five-bit integer representation values. Experimental results reveal that the proposed method reduces the amount of feature map data by 99.30% with a loss of 8.30% in average precision (AP) on the OpenImageV6 dataset and 0.77% in $AP_{M}$ and 0.47% in $AP_{L}$ on the MS COCO 2017 validation set, while outperforming previous PCA-based feature map compression methods even at higher compression rates.
first_indexed	2024-04-09T20:41:12Z
format	Article
id	doaj.art-6df67ae77e6245adb4f69ed7b4aed9b8
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-04-09T20:41:12Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-6df67ae77e6245adb4f69ed7b4aed9b8; 2023-03-29T23:00:19Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; vol. 11, pp. 26308-26319; 10.1109/ACCESS.2023.3254589; 10064277
	Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis
	Minhun Lee (https://orcid.org/0000-0001-8165-5380), Department of Computer Engineering, Kwangwoon University, Seoul, South Korea
	Hansol Choi, Department of Computer Engineering, Kwangwoon University, Seoul, South Korea
	Jihoon Kim, Department of Computer Engineering, Kwangwoon University, Seoul, South Korea
	Jihoon Do (https://orcid.org/0000-0002-8254-2481), Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea
	Hyoungjin Kwon, Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea
	Se Yoon Jeong (https://orcid.org/0000-0002-1675-4814), Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea
	Donggyu Sim (https://orcid.org/0000-0002-2794-9932), Department of Computer Engineering, Kwangwoon University, Seoul, South Korea
	Seoung-Jun Oh (https://orcid.org/0000-0002-7249-3647), Department of Electronic Engineering, Kwangwoon University, Seoul, South Korea
	https://ieeexplore.ieee.org/document/10064277/
	Moving picture experts group; video coding for machines; convolutional neural network; principal component analysis; feature map compression
title	Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis
topic	Moving picture experts group; video coding for machines; convolutional neural network; principal component analysis; feature map compression
url	https://ieeexplore.ieee.org/document/10064277/
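
The description in this record names two stages: a PCA computed over receptive blocks of a feature map, followed by uniform quantization to five-bit integers. The NumPy sketch below is only a rough illustration of that general idea and is not the authors' RPCA. It assumes fixed-size, non-overlapping blocks (the article uses variable-size receptive blocks), picks the number of basis kernels from a cumulative-variance threshold, and uses a simple min/max quantizer. The names block_pca, quantize, dequantize, and the parameters block, energy, and bits are hypothetical and do not come from the article.

```python
import numpy as np


def block_pca(feature_map, block, energy=0.95):
    """Ordinary PCA over non-overlapping block x block patches of a (C, H, W) array.

    Returns the kept basis kernels (k, block*block), the per-patch
    coefficients (n_patches, k), and the mean patch (block*block,).
    """
    c, h, w = feature_map.shape
    hb, wb = h // block, w // block
    # Crop so that height and width are multiples of the block size (assumption).
    x = feature_map[:, :hb * block, :wb * block]
    # Rearrange so that every row is one flattened block.
    x = x.reshape(c, hb, block, wb, block).transpose(0, 1, 3, 2, 4)
    x = x.reshape(-1, block * block)
    mean = x.mean(axis=0)
    xc = x - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(xc, full_matrices=False)
    var = s ** 2
    total = var.sum()
    if total == 0:
        k = 1  # constant input: a single kernel is enough
    else:
        ratio = np.cumsum(var) / total
        # Smallest k whose cumulative variance share reaches the threshold.
        k = min(int(np.searchsorted(ratio, energy)) + 1, len(s))
    basis = vt[:k]
    coeffs = xc @ basis.T
    return basis, coeffs, mean


def quantize(x, bits=5):
    """Uniform quantization of x to integers in [0, 2**bits - 1]."""
    lo, hi = float(x.min()), float(x.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, lo, scale


def dequantize(q, lo, scale):
    """Map quantized integers back to approximate floating-point values."""
    return q.astype(np.float32) * scale + lo
```

In this sketch the basis kernels, the coefficients, and the mean vector would each be passed through quantize separately, mirroring the three quantities the description lists. Each call returns its own offset and step size, which a decoder would need in order to call dequantize.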

Similar Items