A Real-Time Indoor Scene Analysis Method Based on RGBD Stream

Indoor scene analysis is important for applications such as augmented reality. Specifically, we confine the indoor scene analysis problem to two aspects: reconstructing the geometry and understanding the observed elements. Traditional analysis methods that focus on 2D RGB images are limited by the lack of stereo measurements, so RGBD-based indoor scene analysis has received wide attention. However, geometric analysis and semantic analysis have been treated as two separate problems in most recent works, leaving scene analysis incomplete. In our work, we combine a deep network architecture with a 3D online reconstruction algorithm and propose a complete pipeline that simultaneously analyses the indoor scene at the geometric and semantic levels. We take a live depth camera as input and split the scene analysis into two steps. The first step estimates the camera pose and labels scene objects for a single view. The second step fuses the scene objects into an integrated map for a global view. Specifically, we first transfer the input frame to geometric maps and propose a structural constraint iterative closest point (SC-ICP) algorithm for camera tracking. Then, we propose a structural constraint recurrent neural network (SC-RNN) to generate a semantic map for each frame. Finally, the geometric maps and semantic maps from multiple viewpoints are fused into a complete model according to the camera poses. Our method improves the accuracy of camera pose estimation and produces 3D indoor scene analyses with high-quality geometric detail and rich semantic information. Notably, it meets real-time requirements, running at frame rates of ≈25 Hz.
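
For readers who want a concrete picture of the two-step pipeline described in the abstract, the following Python fragment is a minimal sketch of the flow: each depth frame is back-projected into geometric (vertex and normal) maps, a camera pose is estimated, the frame is labelled, and everything is fused into a global map. The camera intrinsics and the track_pose, label_frame, and fuse callables are hypothetical placeholders; this sketch does not reproduce the paper's SC-ICP or SC-RNN.

```python
"""Illustrative sketch only; not the authors' code. Intrinsics and the
track/label/fuse callables are assumed placeholders."""
import numpy as np

# Assumed pinhole intrinsics (fx, fy, cx, cy) for a 640x480 depth sensor.
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

def depth_to_vertex_map(depth_m: np.ndarray) -> np.ndarray:
    """Back-project an HxW depth image (metres) into an HxWx3 vertex map in camera space."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return np.dstack([x, y, depth_m])

def vertex_to_normal_map(vertex_map: np.ndarray) -> np.ndarray:
    """Estimate per-pixel normals from finite differences of neighbouring vertices."""
    dx = np.gradient(vertex_map, axis=1)   # change along image columns
    dy = np.gradient(vertex_map, axis=0)   # change along image rows
    normals = np.cross(dx, dy)
    norm = np.linalg.norm(normals, axis=2, keepdims=True)
    return normals / np.maximum(norm, 1e-8)

def analyse_stream(frames, track_pose, label_frame, fuse):
    """Two-step loop: per-frame pose estimation and labelling, then fusion into a global map.

    In the paper, track_pose corresponds to SC-ICP and label_frame to the SC-RNN;
    here they are left as caller-supplied placeholders.
    """
    pose = np.eye(4)  # world-from-camera transform of the previous frame
    for rgb, depth in frames:
        # Step 1a: transfer the input frame to geometric maps.
        vertex_map = depth_to_vertex_map(depth)
        normal_map = vertex_to_normal_map(vertex_map)
        # Step 1b: camera tracking and per-frame semantic labelling.
        pose = track_pose(vertex_map, normal_map, pose)
        semantic_map = label_frame(rgb, depth)
        # Step 2: fuse geometry and semantics into the integrated map using the pose.
        fuse(vertex_map, normal_map, semantic_map, pose)
    return pose
```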

Bibliographic Details
Main Authors: Chen Wang, Yue Qi
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
Subjects: Indoor scene analysis; camera tracking; recurrent neural networks; semantic segmentation
Online Access:https://ieeexplore.ieee.org/document/8851216/
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2019.2944140
Published in: IEEE Access, vol. 7, pp. 167336-167350, 2019 (IEEE article number 8851216)
Author ORCID: Chen Wang, https://orcid.org/0000-0002-4334-6103
Affiliations: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China (both authors)