Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal

For a Visual Simultaneous Localization and Mapping (vSLAM) system to operate in dynamic indoor environments, moving objects must be taken into account because they can degrade the stability of the system’s visual odometry and the accuracy of its position estimate. vSLAM can use feature points or a sequence of im...

Bibliographic Details
Main Authors: Charalambos Theodorou, Vladan Velisavljevic, Vladimir Dyo
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Sensors
Subjects: visual SLAM; object detection; simultaneous localization and mapping (SLAM)
Online Access: https://www.mdpi.com/1424-8220/22/19/7553
_version_ 1797476894109597696
author Charalambos Theodorou
Vladan Velisavljevic
Vladimir Dyo
author_facet Charalambos Theodorou
Vladan Velisavljevic
Vladimir Dyo
author_sort Charalambos Theodorou
collection DOAJ
description For a Visual Simultaneous Localization and Mapping (vSLAM) system to operate in dynamic indoor environments, moving objects must be taken into account because they can degrade the stability of the system’s visual odometry and the accuracy of its position estimate. vSLAM uses feature points or a sequence of images as its only source of input to perform localization while simultaneously building a map of the environment. This paper proposes a vSLAM system based on ORB-SLAM3 and YOLOR. Combined with an object detection model (YOLOX) applied to the extracted feature points, the proposed system achieves 2–4% better accuracy than VPS-SLAM and DS-SLAM. Static feature points, such as those on signs and benches, are used to calculate the camera position, and dynamic moving objects are eliminated in the tracking thread. The method was validated and evaluated on a custom dataset that includes indoor and outdoor RGB-D images of train stations with dynamic objects and a high density of people, together with ground truth data, sequence data, video recordings of the train stations, and X, Y, Z data. The results show that ORB-SLAM3 with YOLOR as the object detector achieves 89.54% accuracy in dynamic indoor environments, compared to previous systems such as VPS-SLAM.
first_indexed 2024-03-09T21:10:13Z
format Article
id doaj.art-8656cbfeb965404c917cf5dd4297be0c
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T21:10:13Z
publishDate 2022-10-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-8656cbfeb965404c917cf5dd4297be0c
2023-11-23T21:51:01Z
eng
MDPI AG
Sensors
1424-8220
2022-10-01
22
19
7553
10.3390/s22197553
Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal
Charalambos Theodorou (0)
Vladan Velisavljevic (1)
Vladimir Dyo (2)
School of Computer Science and Technology, University of Bedfordshire, Luton LU1 3JU, UK
School of Computer Science and Technology, University of Bedfordshire, Luton LU1 3JU, UK
School of Computer Science and Technology, University of Bedfordshire, Luton LU1 3JU, UK
https://www.mdpi.com/1424-8220/22/19/7553
visual SLAM
object detection
simultaneous localization and mapping (SLAM)
spellingShingle Charalambos Theodorou
Vladan Velisavljevic
Vladimir Dyo
Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal
Sensors
visual SLAM
object detection
simultaneous localization and mapping (SLAM)
title Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal
title_full Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal
title_fullStr Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal
title_full_unstemmed Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal
title_short Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal
title_sort visual slam for dynamic environments based on object detection and optical flow for dynamic object removal
topic visual SLAM
object detection
simultaneous localization and mapping (SLAM)
url https://www.mdpi.com/1424-8220/22/19/7553
work_keys_str_mv AT charalambostheodorou visualslamfordynamicenvironmentsbasedonobjectdetectionandopticalflowfordynamicobjectremoval
AT vladanvelisavljevic visualslamfordynamicenvironmentsbasedonobjectdetectionandopticalflowfordynamicobjectremoval
AT vladimirdyo visualslamfordynamicenvironmentsbasedonobjectdetectionandopticalflowfordynamicobjectremoval
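
The description above says that static feature points are kept for camera pose estimation while points on moving objects are eliminated. The following is a minimal illustrative sketch of that idea, not the authors' implementation. It assumes that an object detector has already produced labelled bounding boxes and that a per-keypoint residual optical flow vector (flow remaining after camera motion is compensated) is available. The names `Detection`, `DYNAMIC_LABELS`, `split_keypoints`, the label set, and the `flow_threshold` value are all hypothetical, as is the rule of discarding a point only when it lies inside a dynamic-class box and also moves.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass(frozen=True)
class Detection:
    """One object detector output (hypothetical structure)."""
    label: str
    box: Box


# Assumption: classes treated as potentially moving; not taken from the paper.
DYNAMIC_LABELS = frozenset({"person", "suitcase", "train"})


def inside(point: Point, box: Box) -> bool:
    """Return True if the point lies within the box, borders included."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def split_keypoints(
    keypoints: Sequence[Point],
    flows: Sequence[Point],
    detections: Sequence[Detection],
    flow_threshold: float = 2.0,  # assumed value, in pixels per frame
) -> Tuple[List[Point], List[Point]]:
    """Split keypoints into (static, dynamic) lists.

    A keypoint is marked dynamic only if it falls inside a box of a
    dynamic class AND its residual flow magnitude exceeds the threshold.
    """
    dynamic_boxes = [d.box for d in detections if d.label in DYNAMIC_LABELS]
    static: List[Point] = []
    dynamic: List[Point] = []
    for point, (dx, dy) in zip(keypoints, flows):
        in_dynamic_box = any(inside(point, box) for box in dynamic_boxes)
        is_moving = (dx * dx + dy * dy) ** 0.5 > flow_threshold
        if in_dynamic_box and is_moving:
            dynamic.append(point)
        else:
            static.append(point)
    return static, dynamic


# Usage with made-up values: one person box and one bench box.
keypoints = [(50.0, 50.0), (120.0, 130.0), (125.0, 140.0), (300.0, 40.0)]
flows = [(0.1, 0.0), (4.0, 3.0), (0.2, 0.1), (6.0, 0.0)]
detections = [
    Detection("person", (100.0, 100.0, 200.0, 300.0)),
    Detection("bench", (280.0, 20.0, 340.0, 80.0)),
]
static, dynamic = split_keypoints(keypoints, flows, detections)
```

In this sketch only the static list would be passed on to pose estimation. Requiring both conditions keeps points on a person who is standing still, which is one possible design choice; a stricter variant could drop every point inside a dynamic-class box.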