Rate-Distortion Optimized Encoding for Deep Image Compression
Deep-learned variational auto-encoders (VAE) have shown remarkable capabilities for lossy image compression. These neural networks typically employ non-linear convolutional layers for finding a compressible representation of the input image. Advanced techniques such as vector quantization, context-a...
Main Authors: | Michael Schafer, Sophie Pientka, Jonathan Pfaff, Heiko Schwarz, Detlev Marpe, Thomas Wiegand |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Open Journal of Circuits and Systems |
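Subjects: | Deep image compression; variational auto-encoders; rate-distortion optimized encoding; non-linear transform coding |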
Online Access: | https://ieeexplore.ieee.org/document/9623337/ |
_version_ | 1819198919812317184 |
---|---|
author | Michael Schafer; Sophie Pientka; Jonathan Pfaff; Heiko Schwarz; Detlev Marpe; Thomas Wiegand |
collection | DOAJ |
description | Deep-learned variational auto-encoders (VAEs) have shown remarkable capabilities for lossy image compression. These neural networks typically employ non-linear convolutional layers to find a compressible representation of the input image. Advanced techniques such as vector quantization, context-adaptive arithmetic coding and variable-rate compression have been implemented in these auto-encoders. Notably, these networks rely on an end-to-end approach, which fundamentally differs from hybrid, block-based video coding systems. Therefore, signal-dependent encoder optimizations have not yet been thoroughly investigated for VAEs. However, rate-distortion optimized encoding heavily determines the compression performance of state-of-the-art video codecs. Designing such optimizations for non-linear, multi-layered networks requires understanding the relationship between the quantization, the bit allocation of the features and the distortion. Therefore, this paper examines the rate-distortion performance of a variable-rate VAE. In particular, the paper demonstrates that the trained encoder network typically finds features with a near-optimal bit allocation across the channels. Furthermore, the relationship between distortion and quantization is approximated by a higher-order polynomial, whose coefficients can be robustly estimated. Based on these considerations, the authors investigate an encoding algorithm for the Lagrange optimization, which significantly improves the coding efficiency. |
first_indexed | 2024-12-23T03:08:06Z |
format | Article |
id | doaj.art-bb87eff80345459faf4e5ddac9532e1c |
institution | Directory Open Access Journal |
issn | 2644-1225 |
language | English |
last_indexed | 2024-12-23T03:08:06Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Open Journal of Circuits and Systems |
spelling | doaj.art-bb87eff80345459faf4e5ddac9532e1c; 2022-12-21T18:02:17Z; eng; IEEE; IEEE Open Journal of Circuits and Systems; 2644-1225; 2021-01-01; 2; 633; 647; 10.1109/OJCAS.2021.3124995; 9623337; Rate-Distortion Optimized Encoding for Deep Image Compression; Michael Schafer (0) https://orcid.org/0000-0003-0309-3161; Sophie Pientka (1) https://orcid.org/0000-0001-9299-9939; Jonathan Pfaff (2) https://orcid.org/0000-0002-3550-0596; Heiko Schwarz (3) https://orcid.org/0000-0002-7136-0041; Detlev Marpe (4) https://orcid.org/0000-0002-5391-3247; Thomas Wiegand (5) https://orcid.org/0000-0002-1121-2581; Video Communication and Applications Department, Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany; Video Communication and Applications Department, Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany; Video Communication and Applications Department, Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany; Video Communication and Applications Department, Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany; Video Communication and Applications Department, Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany; Video Communication and Applications Department, Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany; https://ieeexplore.ieee.org/document/9623337/; Deep image compression; variational auto-encoders; rate-distortion optimized encoding; non-linear transform coding |
title | Rate-Distortion Optimized Encoding for Deep Image Compression |
topic | Deep image compression; variational auto-encoders; rate-distortion optimized encoding; non-linear transform coding |
url | https://ieeexplore.ieee.org/document/9623337/ |