CNN and HEVC Video Coding Features for Static Video Summarization

This study proposes a novel solution for detecting keyframes for static video summarization. We preprocessed well-known video datasets by coding them with the HEVC video coding standard. During coding, the coder generated 64 proposed features for each frame. Additionally, we co...


Bibliographic Details
Main Authors: Obada Issa, Tamer Shanableh
Format: Article
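Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects: Convolution neural network; duplicate frames; sparse auto encoders; static video summarization; video coding; high efficiency video codec (HEVC)
Online Access: https://ieeexplore.ieee.org/document/9815254/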
author Obada Issa
Tamer Shanableh
collection DOAJ
description This study proposes a novel solution for detecting keyframes for static video summarization. We preprocessed well-known video datasets by coding them with the HEVC video coding standard. During coding, the coder generated 64 proposed features for each frame. Additionally, we converted the original YUV data of the raw videos into RGB images and fed them into pretrained CNNs for feature extraction: GoogleNet, AlexNet, Inception-ResNet-v2, and VGG16. The modified datasets are publicly available to the research community. Before detecting keyframes in a video, it is important to identify and eliminate duplicate or similar frames. A subset of the proposed HEVC feature set was used to identify and eliminate these frames. We also propose an elimination solution based on the sum of absolute differences between a frame and its motion-compensated predecessor. These solutions are compared with existing work based on a SIFT flow algorithm that uses CNN features. Subsequently, an optional dimensionality reduction based on stepwise regression was applied to the feature vectors before keyframe detection. This step is compared with existing studies that use sparse autoencoders with CNN features for dimensionality reduction. The accuracy of the proposed keyframe detection system was assessed using positive predictive value, sensitivity, and F-score. When combined with multi-CNN features and a random forest classifier, the proposed solution achieved an average F-score of 0.98.
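The duplicate-frame elimination step mentioned in the description, which relies on the sum of absolute differences (SAD) between a frame and its motion-compensated predecessor, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the block size, the search range, the full-search block matching, and the threshold are all assumptions made for illustration, and frames are assumed to be 2-D lists of luma values.

```python
def block_sad(prev, curr, y, x, dy, dx, block):
    """SAD between the block of `curr` at (y, x) and the block of `prev`
    displaced by the candidate motion vector (dy, dx)."""
    return sum(
        abs(curr[y + r][x + c] - prev[y + dy + r][x + dx + c])
        for r in range(block)
        for c in range(block)
    )


def motion_compensated_sad(prev, curr, block=4, search=1):
    """Total SAD between `curr` and a block-wise motion-compensated
    prediction built from `prev` (hypothetical full-search matching).
    Rows and columns beyond the last whole block are ignored."""
    h, w = len(curr), len(curr[0])
    total = 0
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            # Try every displacement that keeps the block inside `prev`;
            # (0, 0) is always valid, so the list is never empty.
            candidates = [
                block_sad(prev, curr, y, x, dy, dx, block)
                for dy in range(-search, search + 1)
                for dx in range(-search, search + 1)
                if 0 <= y + dy <= h - block and 0 <= x + dx <= w - block
            ]
            total += min(candidates)
    return total


def drop_near_duplicates(frames, threshold):
    """Return indices of frames whose motion-compensated SAD against the
    immediately preceding frame exceeds `threshold` (an assumed tuning
    parameter). The first frame is always kept."""
    kept = [0] if frames else []
    for i in range(1, len(frames)):
        if motion_compensated_sad(frames[i - 1], frames[i]) > threshold:
            kept.append(i)
    return kept


# Usage: the second frame repeats the first; the third introduces new content.
flat = [[10] * 8 for _ in range(8)]
edge = [[10] * 4 + [200] * 4 for _ in range(8)]
frames = [flat, [row[:] for row in flat], edge]
print(drop_near_duplicates(frames, threshold=100))  # prints [0, 2]
```

Taking the minimum over candidate displacements means that a frame that is only a shifted copy of its predecessor scores a low SAD and is treated as a near duplicate, whereas a plain frame difference would flag it as new content.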
first_indexed 2024-04-11T21:33:30Z
format Article
id doaj.art-e8763327ec07450d8b3b9291d138ed98
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-11T21:33:30Z
publishDate 2022-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-e8763327ec07450d8b3b9291d138ed98; 2022-12-22T04:01:53Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; 10; 72080; 72091; 10.1109/ACCESS.2022.3188638; 9815254
CNN and HEVC Video Coding Features for Static Video Summarization
Obada Issa (0) https://orcid.org/0000-0001-8093-218X
Tamer Shanableh (1) https://orcid.org/0000-0002-7651-3094
(0) Department of Computer Science and Engineering, American University of Sharjah, Sharjah, United Arab Emirates
(1) Department of Computer Science and Engineering, American University of Sharjah, Sharjah, United Arab Emirates
title CNN and HEVC Video Coding Features for Static Video Summarization
topic Convolution neural network
duplicate frames
sparse auto encoders
static video summarization
video coding
high efficiency video codec (HEVC)
url https://ieeexplore.ieee.org/document/9815254/