CNN and HEVC Video Coding Features for Static Video Summarization
This study proposes a novel solution for the detection of keyframes for static video summarization. We preprocessed well-known video datasets by coding them using the HEVC video coding standard. During coding, 64 proposed features were generated from the coder for each frame. Additionally, we co...
Main Authors: | Obada Issa, Tamer Shanableh |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Series: | IEEE Access |
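Subjects: | Convolution neural network; duplicate frames; sparse auto encoders; static video summarization; video coding; high efficiency video codec (HEVC) |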
Online Access: | https://ieeexplore.ieee.org/document/9815254/ |
_version_ | 1828149658644381696 |
---|---|
author | Obada Issa; Tamer Shanableh |
author_sort | Obada Issa |
collection | DOAJ |
description | This study proposes a novel solution for detecting keyframes for static video summarization. We preprocessed well-known video datasets by coding them using the HEVC video coding standard. During coding, 64 proposed features were generated from the coder for each frame. Additionally, we converted the original YUV frames of the raw videos into RGB images and fed them into pretrained CNNs for feature extraction. These networks include GoogleNet, AlexNet, Inception-ResNet-v2, and VGG16. The modified datasets are made publicly available to the research community. Before detecting keyframes in a video, it is important to identify and eliminate duplicate or similar video frames. A subset of the proposed HEVC feature set was used to identify these frames and eliminate them from the video. We also propose an elimination solution based on the sum of absolute differences between a frame and its motion-compensated predecessor. The proposed solutions are compared with existing work based on a SIFT flow algorithm that uses CNN features. Subsequently, an optional dimensionality reduction based on stepwise regression was applied to the feature vectors prior to detecting keyframes. This reduction step is compared with existing studies that use sparse autoencoders with CNN features for dimensionality reduction. The accuracy of the proposed keyframe detection system was assessed using positive predictive value, sensitivity, and F-score. When combined with multi-CNN features and a random forest classifier, the proposed solution achieved an average F-score of 0.98. |
first_indexed | 2024-04-11T21:33:30Z |
format | Article |
id | doaj.art-e8763327ec07450d8b3b9291d138ed98 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-11T21:33:30Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-e8763327ec07450d8b3b9291d138ed98; 2022-12-22T04:01:53Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2022-01-01; vol. 10; pp. 72080-72091; DOI 10.1109/ACCESS.2022.3188638; IEEE Xplore document 9815254; CNN and HEVC Video Coding Features for Static Video Summarization; Obada Issa (https://orcid.org/0000-0001-8093-218X); Tamer Shanableh (https://orcid.org/0000-0002-7651-3094); both authors: Department of Computer Science and Engineering, American University of Sharjah, Sharjah, United Arab Emirates; https://ieeexplore.ieee.org/document/9815254/; Convolution neural network; duplicate frames; sparse auto encoders; static video summarization; video coding; high efficiency video codec (HEVC) |
title | CNN and HEVC Video Coding Features for Static Video Summarization |
topic | Convolution neural network; duplicate frames; sparse auto encoders; static video summarization; video coding; high efficiency video codec (HEVC) |
url | https://ieeexplore.ieee.org/document/9815254/ |
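
The description above reports accuracy as positive predictive value, sensitivity, and F-score. As a rough illustration only, and not the authors' evaluation code, the Python sketch below computes these three values for hypothetical sets of detected and ground-truth keyframe indices. It assumes a detected keyframe counts as correct only when its frame index matches a ground-truth index exactly; the function name `keyframe_scores` and the sample indices are invented for this example.

```python
def keyframe_scores(predicted, reference):
    """Return (ppv, sensitivity, f_score) for two collections of frame indices.

    Assumption: exact index matching, no tolerance window around keyframes.
    """
    predicted, reference = set(predicted), set(reference)

    tp = len(predicted & reference)   # detected frames that are true keyframes
    fp = len(predicted - reference)   # detected frames that are not keyframes
    fn = len(reference - predicted)   # true keyframes that were missed

    # Positive predictive value (precision) and sensitivity (recall).
    ppv = tp / (tp + fp) if predicted else 0.0
    sensitivity = tp / (tp + fn) if reference else 0.0

    # F-score as the harmonic mean of the two values above.
    denom = ppv + sensitivity
    f_score = 2 * ppv * sensitivity / denom if denom else 0.0
    return ppv, sensitivity, f_score


# Hypothetical example: four detected keyframes, four ground-truth keyframes.
detected = [10, 50, 90, 130]
ground_truth = [10, 50, 90, 170]
ppv, sensitivity, f_score = keyframe_scores(detected, ground_truth)
print(f"PPV={ppv:.2f} sensitivity={sensitivity:.2f} F={f_score:.2f}")
```

An average F-score such as the one reported would then be the mean of `f_score` over all videos in a dataset, again under the exact-match assumption stated here.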