A Super-Resolution-Based Feature Map Compression for Machine-Oriented Video Coding
Video and image compression methods based on neural networks have recently received much attention. In MPEG standardization, Video Coding for Machine (VCM) is an emerging topic that aims to compress features/images for machine vision tasks. In particular, compressing features has...
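Main Authors: | Jung-Heum Kang, Muhammad Salman Ali, Hye-Won Jeong, Chang-Kyun Choi, Younhee Kim, Se Yoon Jeong, Sung-Ho Bae, Hui Yong Kim |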
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
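Subjects: | Versatile video codec, video coding for machine, feature compression, deep neural network, super resolution |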
Online Access: | https://ieeexplore.ieee.org/document/10078247/ |
_version_ | 1797849326156775424 |
---|---|
author | Jung-Heum Kang; Muhammad Salman Ali; Hye-Won Jeong; Chang-Kyun Choi; Younhee Kim; Se Yoon Jeong; Sung-Ho Bae; Hui Yong Kim |
author_sort | Jung-Heum Kang |
collection | DOAJ |
description | Video and image compression methods based on neural networks have recently received much attention. In MPEG standardization, Video Coding for Machine (VCM) is an emerging topic that aims to compress features/images for machine vision tasks. In particular, compressing features offers advantages in privacy protection and computation off-loading. In this paper, we propose an effective feature compression method equipped with a super-resolution (SR) module for features. Our main motivation is the observation that features are somewhat robust to spatial distortions (e.g., AWGN, blur, quantization distortions, coding artifacts), which leads us to integrate an SR module into the compression framework. We further explore the best training strategy for the proposed method, i.e., the best combination of losses and the proper input feature shapes. Our comprehensive experiments show that the proposed method outperforms the baseline in the original VCM anchor scenario across various QP values with Versatile Video Coding (VVC). Specifically, the proposed framework achieves up to 50% BD-rate reduction compared to the conventional P-layer feature map compression method for the object detection task on the OpenImage dataset. |
first_indexed | 2024-04-09T18:42:08Z |
format | Article |
id | doaj.art-b6a7bb195f694b359b47a043fd21d93a |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-09T18:42:08Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-b6a7bb195f694b359b47a043fd21d93a; 2023-04-10T23:01:31Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11, pp. 34198-34209; DOI 10.1109/ACCESS.2023.3260223; document 10078247; A Super-Resolution-Based Feature Map Compression for Machine-Oriented Video Coding; Jung-Heum Kang (Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea); Muhammad Salman Ali (https://orcid.org/0000-0002-8548-3827; Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea); Hye-Won Jeong (Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea); Chang-Kyun Choi (Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea); Younhee Kim (Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea); Se Yoon Jeong (https://orcid.org/0000-0002-1675-4814; Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea); Sung-Ho Bae (https://orcid.org/0000-0003-2677-3186; Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea); Hui Yong Kim (https://orcid.org/0000-0001-7308-133X; Department of Computer Science and Engineering, Kyung Hee University, Yongin, Republic of Korea); https://ieeexplore.ieee.org/document/10078247/; Versatile video codec; video coding for machine; feature compression; deep neural network; super resolution |
title | A Super-Resolution-Based Feature Map Compression for Machine-Oriented Video Coding |
topic | Versatile video codec; video coding for machine; feature compression; deep neural network; super resolution |
url | https://ieeexplore.ieee.org/document/10078247/ |