A Recurrent Deep Network for Estimating the Pose of Real Indoor Images from Synthetic Image Sequences

Recently, deep convolutional neural networks (CNN) have become popular for indoor visual localisation, where the networks learn to regress the camera pose from images directly. However, these approaches perform a 3D image-based reconstruction of the indoor spaces beforehand to determine camera poses...

Bibliographic Details
Main Authors:	Debaditya Acharya, Sesa Singha Roy, Kourosh Khoshelham, Stephan Winter
Format:	Article
Language:	English
Published:	MDPI AG 2020-09-01
Series:	Sensors
Subjects:	indoor localisation, camera pose regression, 3D building models, long short term memory
Online Access:	https://www.mdpi.com/1424-8220/20/19/5492

_version_	1797552606038458368
author	Debaditya Acharya, Sesa Singha Roy, Kourosh Khoshelham, Stephan Winter
author_facet	Debaditya Acharya, Sesa Singha Roy, Kourosh Khoshelham, Stephan Winter
author_sort	Debaditya Acharya
collection	DOAJ
description	Recently, deep convolutional neural networks (CNN) have become popular for indoor visual localisation, where the networks learn to regress the camera pose from images directly. However, these approaches perform a 3D image-based reconstruction of the indoor spaces beforehand to determine camera poses, which is a challenge for large indoor spaces. Synthetic images derived from 3D indoor models have been used to eliminate the requirement of 3D reconstruction. A limitation of this approach is the low accuracy that results from estimating the pose of each image frame independently. In this article, a visual localisation approach is proposed that exploits the spatio-temporal information from synthetic image sequences to improve localisation accuracy. A deep Bayesian recurrent CNN is fine-tuned using synthetic image sequences obtained from a building information model (BIM) to regress the pose of real image sequences. The results of the experiments indicate that the proposed approach estimates a smoother trajectory with smaller inter-frame error than existing methods. The achievable accuracy with the proposed approach is 1.6 m, which is an improvement of approximately thirty per cent compared to the existing approaches. A Keras implementation can be found in our GitHub repository.
first_indexed	2024-03-10T16:03:29Z
format	Article
id	doaj.art-39154f4832484f3a921b58351622c599
institution	Directory Open Access Journal
issn	1424-8220
language	English
last_indexed	2024-03-10T16:03:29Z
publishDate	2020-09-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-39154f4832484f3a921b58351622c599; 2023-11-20T15:03:19Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2020-09-01; vol. 20, no. 19, art. 5492; DOI 10.3390/s20195492; A Recurrent Deep Network for Estimating the Pose of Real Indoor Images from Synthetic Image Sequences; Debaditya Acharya (0), Sesa Singha Roy (1), Kourosh Khoshelham (2), Stephan Winter (3); (0) Department of Infrastructure Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia; (1) Institute for Sustainable Industries and Livable Cities, Victoria University, Werribee, Victoria 3030, Australia; (2) Department of Infrastructure Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia; (3) Department of Infrastructure Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia; https://www.mdpi.com/1424-8220/20/19/5492; indoor localisation, camera pose regression, 3D building models, long short term memory
title	A Recurrent Deep Network for Estimating the Pose of Real Indoor Images from Synthetic Image Sequences
topic	indoor localisation, camera pose regression, 3D building models, long short term memory
url	https://www.mdpi.com/1424-8220/20/19/5492
