Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents

A 360° video stream provides users a choice of viewing their own point of interest inside the immersive content. Performing head or hand manipulations to view the interesting scene in a 360° video is very tedious, and the user may view the frame of interest during the head/hand movement or even lose i...

Bibliographic Details
Main Authors:	Muhammad Irfan, Muhammad Munsif
Format:	Article
Language:	English
Published:	KeAi Communications Co., Ltd. 2022-06-01
Series:	Virtual Reality & Intelligent Hardware
Subjects:	Virtual Reality; Immersive Contents; Deep Learning; Aesthetic; Saliency
Online Access:	http://www.sciencedirect.com/science/article/pii/S2096579622000420

_version_	1818118335959662592
author	Muhammad Irfan; Muhammad Munsif
author_facet	Muhammad Irfan; Muhammad Munsif
author_sort	Muhammad Irfan
collection	DOAJ
description	A 360° video stream provides users a choice of viewing their own point of interest inside the immersive content. Performing head or hand manipulations to view the interesting scene in a 360° video is very tedious, and the user may view the frame of interest during the head/hand movement or even lose it. Automatically extracting the user’s point of interest (UPI) in a 360° video is very challenging because of subjectivity and differences in comfort. To handle these challenges and provide users with the best and most visually pleasant view, we propose an automatic approach that uses two CNN models: an object detector and an aesthetic scorer of the scene. The proposed framework has three stages: pre-processing, the Deepdive architecture, and a view selection pipeline. In the first stage, an input 360° video frame is divided into three sub-frames, each covering a 120° view. In the second stage, each sub-frame is passed through the CNN models to extract visual features and calculate an aesthetic score. Finally, the decision pipeline selects the sub-frame containing the salient object, based on the detected object and the calculated aesthetic score. Compared with other state-of-the-art techniques, which are domain-specific (e.g., supporting only sports 360° videos), our system supports most genres of 360° video. A performance evaluation of the proposed framework on our own data, collected from various websites, reports its performance across different categories of 360° videos.
first_indexed	2024-12-11T04:52:41Z
format	Article
id	doaj.art-b1917a3e8b1748788be8bbd38e7eb44f
institution	Directory of Open Access Journals
issn	2096-5796
language	English
last_indexed	2024-12-11T04:52:41Z
publishDate	2022-06-01
publisher	KeAi Communications Co., Ltd.
record_format	Article
series	Virtual Reality & Intelligent Hardware
spelling	doaj.art-b1917a3e8b1748788be8bbd38e7eb44f | 2022-12-22T01:20:20Z | eng | KeAi Communications Co., Ltd. | Virtual Reality & Intelligent Hardware | 2096-5796 | 2022-06-01 | vol. 4, no. 3, pp. 247-262 | Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents | Muhammad Irfan (Department of Computer Science, Oakland University, Michigan, USA; corresponding author); Muhammad Munsif (Digital Image Processing Laboratory, Islamia College, Peshawar, Pakistan) | http://www.sciencedirect.com/science/article/pii/S2096579622000420 | Virtual Reality; Immersive Contents; Deep Learning; Aesthetic; Saliency
title	Deepdive: A Learning-Based Approach for Virtual Camera in Immersive Contents
topic	Virtual Reality; Immersive Contents; Deep Learning; Aesthetic; Saliency
url	http://www.sciencedirect.com/science/article/pii/S2096579622000420
work_keys_str_mv	AT muhammadirfan deepdivealearningbasedapproachforvirtualcamerainimmersivecontents AT muhammadmunsif deepdivealearningbasedapproachforvirtualcamerainimmersivecontents
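
The three stages named in the description (split the frame, score each sub-frame, pick a view) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the frame is an equirectangular image whose columns map linearly to yaw, so three equal column bands each cover 120°. The two scorers are hypothetical stand-ins for the object detector and aesthetic CNN, and the weighted sum used to combine them is also an assumption, since the record does not say how the decision pipeline weighs the two signals.

```python
from typing import Callable, List

import numpy as np

Scorer = Callable[[np.ndarray], float]


def split_into_subframes(frame: np.ndarray, n_views: int = 3) -> List[np.ndarray]:
    """Split an H x W x C frame into n_views equal-width column bands.

    Assumes an equirectangular layout, so with n_views=3 each band
    spans 120 degrees of yaw. Wrap-around at the seam is not handled.
    """
    width = frame.shape[1]
    edges = [round(i * width / n_views) for i in range(n_views + 1)]
    return [frame[:, edges[i]:edges[i + 1]] for i in range(n_views)]


def select_view(
    frame: np.ndarray,
    object_score: Scorer,
    aesthetic_score: Scorer,
    weight: float = 0.5,
) -> int:
    """Return the index of the sub-frame with the highest combined score.

    The weighted sum is an illustrative assumption, not the paper's rule.
    """
    subframes = split_into_subframes(frame)
    scores = [
        weight * object_score(sub) + (1.0 - weight) * aesthetic_score(sub)
        for sub in subframes
    ]
    return int(np.argmax(scores))


# Hypothetical placeholder scorers; real ones would wrap CNN models.
def toy_object_score(sub: np.ndarray) -> float:
    """Fraction of bright pixels, standing in for detector confidence."""
    return float((sub > 200).mean())


def toy_aesthetic_score(sub: np.ndarray) -> float:
    """Normalised contrast, standing in for an aesthetic CNN output."""
    return float(sub.std() / 255.0)
```

With a frame whose middle band is the only bright region, `select_view(frame, toy_object_score, toy_aesthetic_score)` returns index 1, the middle 120° view. Passing the scorers in as callables keeps the selection logic independent of whichever models are actually used.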

Similar Items