Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis
This paper presents a method to effectively compress the intermediate layer feature map of a convolutional neural network for the potential structures of Video Coding for Machines, which is an emerging technology for future machine consumption applications. Notably, most extant studies compress a si...
Main Authors: | Minhun Lee, Hansol Choi, Jihoon Kim, Jihoon Do, Hyoungjin Kwon, Se Yoon Jeong, Donggyu Sim, Seoung-Jun Oh |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
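Subjects: | Moving picture experts group; video coding for machines; convolutional neural network; principal component analysis; feature map compression |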
Online Access: | https://ieeexplore.ieee.org/document/10064277/ |
_version_ | 1797856485536956416 |
---|---|
author | Minhun Lee; Hansol Choi; Jihoon Kim; Jihoon Do; Hyoungjin Kwon; Se Yoon Jeong; Donggyu Sim; Seoung-Jun Oh |
author_sort | Minhun Lee |
collection | DOAJ |
description | This paper presents a method for effectively compressing the intermediate-layer feature maps of a convolutional neural network for the potential structures of Video Coding for Machines, an emerging technology for future machine consumption applications. Most existing studies compress a single feature map and therefore cannot fully account for both global and local information within the feature map. This makes it difficult to maintain performance in machine consumption tasks that analyze objects of various sizes in images and videos. To address this problem, a multiscale feature map compression method is proposed that consists of two major processes: receptive block based principal component analysis (RPCA) and uniform integer quantization. RPCA derives the complete basis kernels of a feature map and selects a set of major basis kernels that can represent a sufficient percentage of global or local information, according to the variable-size receptive blocks of each feature map. After each feature map is transformed using the set of major basis kernels, a uniform integer quantizer converts the 32-bit floating-point values of the major basis kernels, the corresponding RPCA coefficients, and a mean vector to five-bit integer representation values. Experimental results show that the proposed method reduces the amount of feature map data by 99.30% with a loss of 8.30% in average precision (AP) on the OpenImageV6 dataset, and losses of 0.77% in $AP_{M}$ and 0.47% in $AP_{L}$ on the MS COCO 2017 validation set, while outperforming previous PCA-based feature map compression methods even at higher compression rates. |
first_indexed | 2024-04-09T20:41:12Z |
format | Article |
id | doaj.art-6df67ae77e6245adb4f69ed7b4aed9b8 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-09T20:41:12Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-6df67ae77e6245adb4f69ed7b4aed9b8; 2023-03-29T23:00:19Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11, pp. 26308-26319; DOI 10.1109/ACCESS.2023.3254589; document 10064277; Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis; authors: Minhun Lee [0] (https://orcid.org/0000-0001-8165-5380), Hansol Choi [1], Jihoon Kim [2], Jihoon Do [3] (https://orcid.org/0000-0002-8254-2481), Hyoungjin Kwon [4], Se Yoon Jeong [5] (https://orcid.org/0000-0002-1675-4814), Donggyu Sim [6] (https://orcid.org/0000-0002-2794-9932), Seoung-Jun Oh [7] (https://orcid.org/0000-0002-7249-3647); affiliations: Department of Computer Engineering, Kwangwoon University, Seoul, South Korea; Department of Computer Engineering, Kwangwoon University, Seoul, South Korea; Department of Computer Engineering, Kwangwoon University, Seoul, South Korea; Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea; Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea; Media Coding Research Section, Electronics and Telecommunications Research Institute, Daejeon, South Korea; Department of Computer Engineering, Kwangwoon University, Seoul, South Korea; Department of Electronic Engineering, Kwangwoon University, Seoul, South Korea; https://ieeexplore.ieee.org/document/10064277/; keywords: Moving picture experts group; video coding for machines; convolutional neural network; principal component analysis; feature map compression |
title | Feature Map Compression for Video Coding for Machines Based on Receptive Block Based Principal Component Analysis |
topic | Moving picture experts group; video coding for machines; convolutional neural network; principal component analysis; feature map compression |
url | https://ieeexplore.ieee.org/document/10064277/ |
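
The two stages named in the description (a block-based PCA transform followed by uniform integer quantization to five bits) can be illustrated with a minimal NumPy sketch. This is not the authors' RPCA: it assumes a single feature map of shape (C, H, W), fixed-size non-overlapping blocks instead of variable-size receptive blocks, and a plain min/max uniform quantizer. All function names and the `keep_var` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def block_pca_compress(fmap, block, keep_var=0.95):
    """PCA over non-overlapping block x block patches of a (C, H, W) feature map.

    Assumption: fixed block size; the paper's receptive blocks vary in size.
    Returns the kept basis kernels, per-patch coefficients, and the mean vector.
    """
    c, h, w = fmap.shape
    assert h % block == 0 and w % block == 0, "H and W must be multiples of block"
    # Rearrange into one row per patch: (num_patches, block * block).
    patches = (fmap.reshape(c, h // block, block, w // block, block)
                   .transpose(0, 1, 3, 2, 4)
                   .reshape(-1, block * block))
    mean = patches.mean(axis=0)
    centered = patches - mean
    # Rows of vt are the principal directions (the "basis kernels" here).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest number of kernels whose cumulative variance reaches keep_var.
    k = min(int(np.searchsorted(energy, keep_var)) + 1, len(s))
    basis = vt[:k]
    coeffs = centered @ basis.T
    return basis, coeffs, mean

def reconstruct(basis, coeffs, mean, shape, block):
    """Invert block_pca_compress back to a (C, H, W) array."""
    c, h, w = shape
    patches = coeffs @ basis + mean
    return (patches.reshape(c, h // block, w // block, block, block)
                   .transpose(0, 1, 3, 2, 4)
                   .reshape(c, h, w))

def quantize_uniform(x, bits=5):
    """Map floats to integers in [0, 2**bits - 1] with a uniform step size."""
    lo, hi = float(x.min()), float(x.max())
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = np.clip(np.round((x - lo) / scale), 0, levels).astype(np.uint8)
    return q, lo, scale

def dequantize_uniform(q, lo, scale):
    """Recover approximate float values from the integer codes."""
    return q.astype(np.float64) * scale + lo
```

In this sketch the same quantizer would be applied separately to `basis`, `coeffs`, and `mean`, mirroring the three quantities the description says are converted to five-bit values; the decoder dequantizes each and calls `reconstruct`. Lowering `keep_var` keeps fewer basis kernels and so trades reconstruction accuracy for a smaller payload.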