Fast Mode Decision Method of Multiple Weighted Bi-Predictions Using Lightweight Multilayer Perceptron in Versatile Video Coding

Versatile Video Coding (VVC), the state-of-the-art video coding standard, was developed by the Joint Video Experts Team (JVET) of ISO/IEC Moving Picture Experts Group (MPEG) and ITU-T Video Coding Experts Group (VCEG) in 2020. Although VVC can provide powerful coding performance, it requires tremend...

Bibliographic Details
Main Authors: Taesik Lee, Dongsan Jun
Format: Article
Language: English
Published: MDPI AG 2023-06-01
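Series: Electronics
Subjects: bi-prediction with CU-level weight (BCW); complexity reduction; inter prediction; multilayer perceptron (MLP); neural network; versatile video coding (VVC)
Online Access: https://www.mdpi.com/2079-9292/12/12/2685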
author Taesik Lee
Dongsan Jun
author_facet Taesik Lee
Dongsan Jun
author_sort Taesik Lee
collection DOAJ
description Versatile Video Coding (VVC), the state-of-the-art video coding standard, was developed by the Joint Video Experts Team (JVET) of ISO/IEC Moving Picture Experts Group (MPEG) and ITU-T Video Coding Experts Group (VCEG) in 2020. Although VVC can provide powerful coding performance, it requires tremendous computational complexity to determine the optimal mode decision during the encoding process. In particular, VVC adopted the bi-prediction with CU-level weight (BCW) as one of the new tools, which enhanced the coding efficiency of conventional bi-prediction by assigning different weights to the two prediction blocks in the process of inter prediction. In this study, we investigate the statistical characteristics of input features that exhibit a correlation with the BCW and define four useful types of categories to facilitate the inter prediction of VVC. With the investigated input features, a lightweight neural network with multilayer perceptron (MLP) architecture is designed to provide high accuracy and low complexity. We propose a fast BCW mode decision method with a lightweight MLP to reduce the computational complexity of the weighted multiple bi-prediction in the VVC encoder. The experimental results show that the proposed method significantly reduced the BCW encoding complexity by up to 33% with unnoticeable coding loss, compared to the VVC test model (VTM) under the random-access (RA) configuration.
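The description above says that BCW assigns different weights to the two prediction blocks of a bi-predicted CU. A minimal sketch of that blending step follows. It assumes the candidate weight set {-2, 3, 4, 5, 10} (in units of 1/8) and the integer blend `((8 - w) * P0 + w * P1 + 4) >> 3` from the general VVC design; neither is stated in this record. The function names are hypothetical, the blend is applied directly to final sample values for simplicity, and the MLP-based fast decision proposed in the article is not modeled.

```python
# Simplified illustration of BCW sample blending (not the VTM implementation).
# Assumption: candidate weights w in units of 1/8; w = 4 is the plain average.
BCW_WEIGHTS = (-2, 3, 4, 5, 10)


def bcw_blend(p0, p1, w, bit_depth=8):
    """Blend two prediction samples as ((8 - w) * p0 + w * p1 + 4) >> 3, then clip."""
    value = ((8 - w) * p0 + w * p1 + 4) >> 3
    max_value = (1 << bit_depth) - 1
    return max(0, min(max_value, value))


def bcw_blend_block(block0, block1, w, bit_depth=8):
    """Apply the blend to two equally sized 2-D blocks given as lists of rows."""
    return [
        [bcw_blend(a, b, w, bit_depth) for a, b in zip(row0, row1)]
        for row0, row1 in zip(block0, block1)
    ]


if __name__ == "__main__":
    ref0 = [[100, 102], [104, 106]]
    ref1 = [[120, 118], [116, 114]]
    for w in BCW_WEIGHTS:
        print(w, bcw_blend_block(ref0, ref1, w))
```

An exhaustive encoder would evaluate a rate-distortion cost for each candidate weight, which is the search the article's fast mode decision aims to shorten.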
first_indexed 2024-03-11T02:32:38Z
format Article
id doaj.art-587de840b9594a678446b82aa57b727c
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-11T02:32:38Z
publishDate 2023-06-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-587de840b9594a678446b82aa57b727c
2023-11-18T10:09:12Z
eng
MDPI AG
Electronics
2079-9292
2023-06-01
12
12
2685
10.3390/electronics12122685
Fast Mode Decision Method of Multiple Weighted Bi-Predictions Using Lightweight Multilayer Perceptron in Versatile Video Coding
Taesik Lee (Department of Computer Engineering, Dong-A University, Busan 49315, Republic of Korea)
Dongsan Jun (Department of Computer Engineering, Dong-A University, Busan 49315, Republic of Korea)
https://www.mdpi.com/2079-9292/12/12/2685
bi-prediction with CU-level weight (BCW)
complexity reduction
inter prediction
multilayer perceptron (MLP)
neural network
versatile video coding (VVC)
spellingShingle Taesik Lee
Dongsan Jun
Fast Mode Decision Method of Multiple Weighted Bi-Predictions Using Lightweight Multilayer Perceptron in Versatile Video Coding
Electronics
bi-prediction with CU-level weight (BCW)
complexity reduction
inter prediction
multilayer perceptron (MLP)
neural network
versatile video coding (VVC)
title Fast Mode Decision Method of Multiple Weighted Bi-Predictions Using Lightweight Multilayer Perceptron in Versatile Video Coding
title_full Fast Mode Decision Method of Multiple Weighted Bi-Predictions Using Lightweight Multilayer Perceptron in Versatile Video Coding
title_fullStr Fast Mode Decision Method of Multiple Weighted Bi-Predictions Using Lightweight Multilayer Perceptron in Versatile Video Coding
title_full_unstemmed Fast Mode Decision Method of Multiple Weighted Bi-Predictions Using Lightweight Multilayer Perceptron in Versatile Video Coding
title_short Fast Mode Decision Method of Multiple Weighted Bi-Predictions Using Lightweight Multilayer Perceptron in Versatile Video Coding
title_sort fast mode decision method of multiple weighted bi predictions using lightweight multilayer perceptron in versatile video coding
topic bi-prediction with CU-level weight (BCW)
complexity reduction
inter prediction
multilayer perceptron (MLP)
neural network
versatile video coding (VVC)
url https://www.mdpi.com/2079-9292/12/12/2685
work_keys_str_mv AT taesiklee fastmodedecisionmethodofmultipleweightedbipredictionsusinglightweightmultilayerperceptroninversatilevideocoding
AT dongsanjun fastmodedecisionmethodofmultipleweightedbipredictionsusinglightweightmultilayerperceptroninversatilevideocoding