Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning

In this paper, we propose an intra-picture prediction method for depth video by block clustering through a neural network. The proposed method addresses the drop in intra-prediction performance for depth video that occurs when a block contains two or more clusters. The proposed neural...


Bibliographic Details
Main Authors: Dong-seok Lee, Soon-kak Kwon
Format: Article
Language: English
Published: MDPI AG 2022-12-01
Series: Sensors
Subjects: intra prediction; depth video coding; deep learning; 1D CNN; clustering
Online Access: https://www.mdpi.com/1424-8220/22/24/9656
_version_ 1797455351735386112
author Dong-seok Lee
Soon-kak Kwon
collection DOAJ
description In this paper, we propose an intra-picture prediction method for depth video by block clustering through a neural network. The proposed method addresses the drop in intra-prediction performance for depth video that occurs when a block contains two or more clusters. The proposed neural network consists of a spatial feature prediction network and a clustering network. The spatial feature prediction network exploits spatial features in the vertical and horizontal directions and contains a 1D CNN layer and a fully connected layer. The 1D CNN layer extracts the spatial features for the vertical and horizontal directions from the top block and the left block of the reference pixels, respectively. Although a 1D CNN is designed to handle time-series data, it can also extract spatial features when the pixel order along a given direction is treated as a sequence of timestamps. The fully connected layer predicts the spatial features of the block to be coded from the extracted features. The clustering network finds clusters from the spatial features output by the spatial feature prediction network. It consists of four CNN layers: the first three combine the spatial features of the vertical and horizontal directions, and the last outputs the probabilities that the pixels belong to the clusters. The pixels of the block are predicted from the representative values of the clusters, each of which is the average of the reference pixels belonging to that cluster. To support intra prediction for various block sizes, the block is scaled to the size of the network input, and the prediction result of the network is scaled back to the original size. In network training, the mean square error between the original block and the predicted block is used as the loss function. A penalty on output values far from both ends is added to the loss function to make the network's clustering clear. In the simulation results, the bit rate is reduced by up to 12.45% at the same distortion compared with the latest video coding standard.
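The prediction and loss steps summarized in the description can be illustrated with a small sketch. It is not the authors' implementation: the record does not give the number of clusters, the rule for assigning reference pixels to clusters, or the exact form of the penalty. The sketch assumes two clusters, a per-pixel probability of belonging to cluster 1, probability-weighted averages of the reference pixels as representative values, and a penalty of the form p(1 - p), which is zero at both ends (0 and 1) and largest at 0.5. The function names and the `penalty_weight` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

Block = List[List[float]]


def cluster_representatives(ref_pixels: Sequence[float],
                            ref_probs: Sequence[float],
                            eps: float = 1e-9) -> Tuple[float, float]:
    """Return (rep0, rep1): weighted means of the reference pixels.

    ref_probs[i] is the assumed probability that reference pixel i
    belongs to cluster 1; (1 - ref_probs[i]) is its weight for cluster 0.
    """
    total = float(len(ref_pixels))
    w1 = float(sum(ref_probs))
    w0 = total - w1
    s1 = sum(p * v for p, v in zip(ref_probs, ref_pixels))
    s0 = sum((1.0 - p) * v for p, v in zip(ref_probs, ref_pixels))
    # If a cluster receives no weight, fall back to the plain mean.
    fallback = sum(ref_pixels) / total
    rep0 = s0 / w0 if w0 > eps else fallback
    rep1 = s1 / w1 if w1 > eps else fallback
    return rep0, rep1


def predict_block(block_probs: Block, rep0: float, rep1: float) -> Block:
    """Predict each pixel as a probability-weighted mix of the two
    cluster representative values (an assumed soft assignment)."""
    return [[(1.0 - p) * rep0 + p * rep1 for p in row] for row in block_probs]


def training_loss(original: Block, predicted: Block, block_probs: Block,
                  penalty_weight: float = 0.1) -> float:
    """Mean square error plus an assumed penalty p * (1 - p) that grows
    as the probabilities move away from both ends, 0 and 1."""
    n = sum(len(row) for row in original)
    mse = sum((o - q) ** 2
              for row_o, row_q in zip(original, predicted)
              for o, q in zip(row_o, row_q)) / n
    penalty = sum(p * (1.0 - p) for row in block_probs for p in row) / n
    return mse + penalty_weight * penalty


# Toy depth block: background at depth 50, foreground at depth 200.
ref_pixels = [50.0, 50.0, 200.0, 200.0]
ref_probs = [0.0, 0.0, 1.0, 1.0]
block_probs = [[0.0, 1.0], [0.0, 1.0]]

rep0, rep1 = cluster_representatives(ref_pixels, ref_probs)
predicted = predict_block(block_probs, rep0, rep1)
print(rep0, rep1)
print(predicted)
print(training_loss([[50.0, 200.0], [50.0, 200.0]], predicted, block_probs))
```

In this toy case the probabilities are exactly 0 or 1, so the penalty term vanishes and the block edge between the two depth levels is reproduced without blurring, which is the behaviour the clustering approach is meant to achieve for blocks containing two clusters.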
first_indexed 2024-03-09T15:52:10Z
format Article
id doaj.art-0643b3f97f084f05a5c3597d1826f102
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T15:52:10Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-0643b3f97f084f05a5c3597d1826f102 2023-11-24T17:53:08Z eng MDPI AG, Sensors, ISSN 1424-8220, 2022-12-01, vol. 22, no. 24, article 9656, doi:10.3390/s22249656
Dong-seok Lee (AI Grand ICT Research Center, Dong-eui University, Busan 47340, Republic of Korea)
Soon-kak Kwon (Department of Computer Software Engineering, Dong-eui University, Busan 47340, Republic of Korea)
title Intra Prediction Method for Depth Video Coding by Block Clustering through Deep Learning
topic intra prediction
depth video coding
deep learning
1D CNN
clustering
url https://www.mdpi.com/1424-8220/22/24/9656