Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents
A 360° video stream gives users the choice of viewing their own point of interest inside the immersive content. Performing head or hand manipulations to view the interesting scene in a 360° video is very tedious, and the user may view the frame of interest only during the head/hand movement or even lose i...
Main Authors: | Muhammad Irfan, Muhammad Munsif |
---|---|
Format: | Article |
Language: | English |
Published: | KeAi Communications Co., Ltd., 2022-06-01 |
Series: | Virtual Reality & Intelligent Hardware |
Subjects: | Virtual Reality, Immersive Contents, Deep Learning, Aesthetic, Saliency |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2096579622000420 |
_version_ | 1818118335959662592 |
---|---|
author | Muhammad Irfan; Muhammad Munsif |
author_facet | Muhammad Irfan; Muhammad Munsif |
author_sort | Muhammad Irfan |
collection | DOAJ |
description | A 360° video stream gives users the choice of viewing their own point of interest inside the immersive content. Performing head or hand manipulations to view the interesting scene in a 360° video is very tedious, and the user may view the frame of interest only during the head/hand movement or even lose it. Automatically extracting the user's point of interest (UPI) in a 360° video is very challenging because of subjectivity and differences in comfort. To handle these challenges and give users the best and most visually pleasing view, we propose an automatic approach that uses two CNN models: an object detector and an aesthetic scorer of the scene. The proposed framework has three stages: pre-processing, the Deepdive architecture, and a view selection pipeline. In the first stage, an input 360° video frame is divided into three sub-frames, each with a 120° view. In the second stage, each sub-frame is passed through the CNN models to extract visual features and calculate an aesthetic score. Finally, the decision pipeline selects the sub-frame containing the salient object, based on the detected objects and the calculated aesthetic score. Compared with other state-of-the-art techniques, which are domain-specific approaches (i.e., they support sports 360° videos), our system supports most 360° video genres. A performance evaluation of the proposed framework on our own data, collected from various websites, shows how it performs across different categories of 360° videos. |
first_indexed | 2024-12-11T04:52:41Z |
format | Article |
id | doaj.art-b1917a3e8b1748788be8bbd38e7eb44f |
institution | Directory Open Access Journal |
issn | 2096-5796 |
language | English |
last_indexed | 2024-12-11T04:52:41Z |
publishDate | 2022-06-01 |
publisher | KeAi Communications Co., Ltd. |
record_format | Article |
series | Virtual Reality & Intelligent Hardware |
spelling | doaj.art-b1917a3e8b1748788be8bbd38e7eb44f 2022-12-22T01:20:20Z eng KeAi Communications Co., Ltd. Virtual Reality & Intelligent Hardware 2096-5796 2022-06-01 4 3 247 262 Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents Muhammad Irfan [0], Muhammad Munsif [1]. [0] Department of Computer Science, Oakland University, Michigan, USA; corresponding author. [1] Digital Image Processing Laboratory, Islamia College, Peshawar, Pakistan. A 360° video stream gives users the choice of viewing their own point of interest inside the immersive content. Performing head or hand manipulations to view the interesting scene in a 360° video is very tedious, and the user may view the frame of interest only during the head/hand movement or even lose it. Automatically extracting the user's point of interest (UPI) in a 360° video is very challenging because of subjectivity and differences in comfort. To handle these challenges and give users the best and most visually pleasing view, we propose an automatic approach that uses two CNN models: an object detector and an aesthetic scorer of the scene. The proposed framework has three stages: pre-processing, the Deepdive architecture, and a view selection pipeline. In the first stage, an input 360° video frame is divided into three sub-frames, each with a 120° view. In the second stage, each sub-frame is passed through the CNN models to extract visual features and calculate an aesthetic score. Finally, the decision pipeline selects the sub-frame containing the salient object, based on the detected objects and the calculated aesthetic score. Compared with other state-of-the-art techniques, which are domain-specific approaches (i.e., they support sports 360° videos), our system supports most 360° video genres. A performance evaluation of the proposed framework on our own data, collected from various websites, shows how it performs across different categories of 360° videos. http://www.sciencedirect.com/science/article/pii/S2096579622000420 Virtual Reality; Immersive Contents; Deep Learning; Aesthetic; Saliency |
spellingShingle | Muhammad Irfan; Muhammad Munsif; Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents; Virtual Reality & Intelligent Hardware; Virtual Reality; Immersive Contents; Deep Learning; Aesthetic; Saliency |
title | Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents |
title_full | Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents |
title_fullStr | Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents |
title_full_unstemmed | Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents |
title_short | Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents |
title_sort | deepdive a learning based approach for virtual camera in immersive contents |
topic | Virtual Reality; Immersive Contents; Deep Learning; Aesthetic; Saliency |
url | http://www.sciencedirect.com/science/article/pii/S2096579622000420 |
work_keys_str_mv | AT muhammadirfan deepdivealearningbasedapproachforvirtualcamerainimmersivecontents AT muhammadmunsif deepdivealearningbasedapproachforvirtualcamerainimmersivecontents |
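
The three-stage pipeline summarized in the description (split a 360° frame into three 120° sub-frames, score each one, pick the best) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `split_into_subframes` and `select_view` are hypothetical, the two scorers are plain stand-in functions rather than the paper's CNN models, and the weighted-sum decision rule is a guess, since the abstract does not state how the detector output and the aesthetic score are combined.

```python
from typing import Callable, List, Sequence, Tuple

Frame = List[List[float]]  # rows of pixel values; a stand-in for a real image


def split_into_subframes(frame: Frame, parts: int = 3) -> List[Frame]:
    """Split an equirectangular frame into `parts` equal-width column bands.

    With parts=3, each band covers 360 / 3 = 120 degrees of yaw.
    """
    width = len(frame[0])
    if width % parts != 0:
        raise ValueError("frame width must be divisible by the number of parts")
    band = width // parts
    return [[row[i * band:(i + 1) * band] for row in frame] for i in range(parts)]


def select_view(
    subframes: Sequence[Frame],
    object_score: Callable[[Frame], float],
    aesthetic_score: Callable[[Frame], float],
    weight: float = 0.5,
) -> Tuple[int, float]:
    """Return (index, combined score) of the highest-scoring sub-frame.

    ASSUMPTION: the two scores are combined by a weighted sum. The abstract
    does not describe the actual decision rule.
    """
    combined = [
        weight * object_score(sf) + (1.0 - weight) * aesthetic_score(sf)
        for sf in subframes
    ]
    best = max(range(len(combined)), key=combined.__getitem__)
    return best, combined[best]


# Toy usage with stand-in scorers (NOT real CNNs): the "object" score is the
# brightest pixel and the "aesthetic" score is the mean pixel value.
def peak(sf: Frame) -> float:
    return float(max(map(max, sf)))


def mean(sf: Frame) -> float:
    return sum(map(sum, sf)) / (len(sf) * len(sf[0]))


frame = [[0, 0, 5, 9, 1, 1], [0, 0, 7, 9, 1, 1]]
subs = split_into_subframes(frame)
best_index, best_score = select_view(subs, object_score=peak, aesthetic_score=mean)
```

In a real system the two callables would wrap trained models, and a sub-frame boundary could cut through an object of interest; the sketch ignores both concerns and only shows the control flow.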