Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents

A 360° video stream gives users the choice of viewing their own point of interest within the immersive content. Performing head or hand manipulations to find the interesting scene in a 360° video is tedious, and the user may glimpse the frame of interest only during the head/hand movement or even lose i...

Bibliographic Details
Main Authors: Muhammad Irfan, Muhammad Munsif
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2022-06-01
Series: Virtual Reality & Intelligent Hardware
Subjects: Virtual Reality; Immersive Contents; Deep Learning; Aesthetic; Saliency
Online Access: http://www.sciencedirect.com/science/article/pii/S2096579622000420
_version_ 1818118335959662592
author Muhammad Irfan
Muhammad Munsif
author_facet Muhammad Irfan
Muhammad Munsif
author_sort Muhammad Irfan
collection DOAJ
description A 360° video stream gives users the choice of viewing their own point of interest within the immersive content. Performing head or hand manipulations to find the interesting scene in a 360° video is tedious, and the user may glimpse the frame of interest only during the head/hand movement or miss it entirely. Automatically extracting the user's point of interest (UPI) in a 360° video is challenging because of subjectivity and differences in viewing comfort. To handle these challenges and provide users with the best and most visually pleasing view, we propose an automatic approach that uses two CNN models: an object detector and an aesthetic scorer for the scene. The proposed framework has three stages: pre-processing, the Deepdive architecture, and a view selection pipeline. In the first stage, an input 360° video frame is divided into three sub-frames, each covering a 120° view. In the second stage, each sub-frame is passed through the CNN models to extract visual features and calculate an aesthetic score. Finally, the decision pipeline selects the sub-frame containing the salient object, based on the detected object and the calculated aesthetic score. Compared with other state-of-the-art techniques, which are domain-specific (i.e., they support sports 360° videos), our system supports most 360° video genres. A performance evaluation of the proposed framework on data we collected from various websites reports performance across different categories of 360° videos.
first_indexed 2024-12-11T04:52:41Z
format Article
id doaj.art-b1917a3e8b1748788be8bbd38e7eb44f
institution Directory of Open Access Journals
issn 2096-5796
language English
last_indexed 2024-12-11T04:52:41Z
publishDate 2022-06-01
publisher KeAi Communications Co., Ltd.
record_format Article
series Virtual Reality & Intelligent Hardware
spelling doaj.art-b1917a3e8b1748788be8bbd38e7eb44f | 2022-12-22T01:20:20Z | eng | KeAi Communications Co., Ltd. | Virtual Reality & Intelligent Hardware | 2096-5796 | 2022-06-01 | 4 | 3 | 247 | 262 | Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents | Muhammad Irfan (0) | Muhammad Munsif (1) | (0) Department of Computer Science, Oakland University, Michigan, USA; Corresponding author. | (1) Digital Image Processing Laboratory, Islamia College, Peshawar, Pakistan | http://www.sciencedirect.com/science/article/pii/S2096579622000420 | Virtual Reality | Immersive Contents | Deep Learning | Aesthetic | Saliency
title Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents
topic Virtual Reality
Immersive Contents
Deep Learning
Aesthetic
Saliency
url http://www.sciencedirect.com/science/article/pii/S2096579622000420
work_keys_str_mv AT muhammadirfan deepdivealearningbasedapproachforvirtualcamerainimmersivecontents
AT muhammadmunsif deepdivealearningbasedapproachforvirtualcamerainimmersivecontents
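
The three stages named in the description (split a 360° frame into three 120° sub-frames, score each one, pick a view) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the frame is stored in equirectangular projection, so that equal-width vertical strips span equal yaw angles. The names split_into_subframes, select_view, detect_objects and aesthetic_score are hypothetical, the two scorers are stand-in callables rather than CNN models, and the selection rule (prefer sub-frames with a detected object, then take the highest aesthetic score) is an assumption, since the record does not say how the two signals are combined.

```python
from typing import Callable, List, Sequence

# A frame is modelled as rows of pixel values; a stand-in for a real image array.
Frame = List[List[int]]


def split_into_subframes(frame: Frame, parts: int = 3) -> List[Frame]:
    """Split a frame into `parts` vertical strips of (near) equal width.

    Assumption: the frame is equirectangular, so with parts=3 each strip
    covers roughly 120 degrees of yaw.
    """
    width = len(frame[0])
    bounds = [round(i * width / parts) for i in range(parts + 1)]
    return [
        [row[bounds[i]:bounds[i + 1]] for row in frame]
        for i in range(parts)
    ]


def select_view(
    subframes: Sequence[Frame],
    detect_objects: Callable[[Frame], int],
    aesthetic_score: Callable[[Frame], float],
) -> int:
    """Return the index of the chosen sub-frame.

    Assumed rule (not taken from the paper): sub-frames with at least one
    detected object win over those with none; among those, the highest
    aesthetic score wins. Ties go to the earliest sub-frame.
    """
    keys = [(detect_objects(sf) > 0, aesthetic_score(sf)) for sf in subframes]
    return max(range(len(subframes)), key=lambda i: keys[i])


# Toy usage with a 2x6 "frame" and hypothetical stand-in scorers.
frame = [[0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
subframes = split_into_subframes(frame)
chosen = select_view(
    subframes,
    detect_objects=lambda sf: 1 if sf[0][0] >= 2 else 0,
    aesthetic_score=lambda sf: float(sum(sf[0])),
)
```

In a real system the two callables would wrap the object detector and the aesthetic model, and the strips would be cut from a decoded video frame instead of nested lists.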