Simple But Effective Scale Estimation for Monocular Visual Odometry in Road Driving Scenarios

In large-scale environments, scale drift is a crucial problem of monocular visual simultaneous localization and mapping (SLAM). A common solution is to utilize the camera height, which can be obtained using the reconstructed 3D ground points (3DGPs) from two successive frames, as prior knowledge. In...

Bibliographic Details
Main Authors: Ming Fan, Seung-Wook Kim, Sung-Tae Kim, Jee-Young Sun, Sung-Jea Ko
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Monocular SLAM; scale estimation; 3D plane fitting
Online Access: https://ieeexplore.ieee.org/document/9205188/
_version_ 1818431138686828544
author Ming Fan
Seung-Wook Kim
Sung-Tae Kim
Jee-Young Sun
Sung-Jea Ko
collection DOAJ
description In large-scale environments, scale drift is a crucial problem in monocular visual simultaneous localization and mapping (SLAM). A common solution is to use the camera height as prior knowledge; it can be obtained from the 3D ground points (3DGPs) reconstructed from two successive frames. Increasing the number of 3DGPs by using more frames is a natural extension of this solution for estimating the camera height more precisely. However, simply applying conventional methods to multiple frames is hard in real-world scenarios, because vehicle motion and inaccurate feature matching inevitably introduce large uncertainty and noisy 3DGPs. In this study, we propose an elaborate method to collect confident 3DGPs from multiple frames for robust scale estimation. First, we gather 3DGP candidates that are visible in more than a predefined number of frames. To verify the candidates, we filter out 3D points that lie outside the road region obtained by a deep-learning-based road segmentation model. In addition, we formulate an optimization problem constrained by a simple but effective geometric assumption, namely that the normal vector of the ground plane lies in the null space of the movement vector of the camera center, and provide a closed-form solution. ORB-SLAM with the proposed scale estimation method achieves an average translation error of 1.19% on the KITTI dataset, outperforming state-of-the-art conventional monocular visual SLAM methods in road driving scenarios.
first_indexed 2024-12-14T15:44:33Z
format Article
id doaj.art-86d0ecf4719542c5ac66ff571289890d
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-14T15:44:33Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-86d0ecf4719542c5ac66ff571289890d
2022-12-21T22:55:32Z
eng
IEEE
IEEE Access
ISSN 2169-3536
2020-01-01
Volume 8, pages 175891-175903
DOI 10.1109/ACCESS.2020.3026347
IEEE document number 9205188
Simple But Effective Scale Estimation for Monocular Visual Odometry in Road Driving Scenarios
Ming Fan, https://orcid.org/0000-0002-0988-9091
Seung-Wook Kim, https://orcid.org/0000-0002-6004-4086
Sung-Tae Kim, https://orcid.org/0000-0001-6673-6924
Jee-Young Sun
Sung-Jea Ko, https://orcid.org/0000-0002-4875-7091
Affiliation (all five authors): Department of Electrical Engineering, Korea University, Seoul, South Korea
https://ieeexplore.ieee.org/document/9205188/
Monocular SLAM
scale estimation
3D plane fitting
title Simple But Effective Scale Estimation for Monocular Visual Odometry in Road Driving Scenarios
topic Monocular SLAM
scale estimation
3D plane fitting
url https://ieeexplore.ieee.org/document/9205188/
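
Illustrative sketch (not taken from the article): the description states that the ground-plane normal is constrained to lie in the null space of the camera movement vector and that the problem has a closed-form solution, but the record does not give the formulation. The Python code below shows one possible way such a constraint can be solved in closed form, as a least-squares plane fit restricted to normals perpendicular to the motion direction. The function names fit_ground_plane and scale_factor, the use of NumPy, and the assumption that the 3D ground points are expressed in the camera frame with the camera center at the origin are all hypothetical choices made for this example.

```python
import numpy as np


def fit_ground_plane(points, motion):
    """Fit a plane to 3D points with the normal forced perpendicular to `motion`.

    points: (N, 3) ground points in the camera frame (camera center at origin).
    motion: (3,) camera translation direction between frames.
    Returns (unit normal, estimated camera height above the plane).
    """
    points = np.asarray(points, dtype=float)
    t = np.asarray(motion, dtype=float)
    t = t / np.linalg.norm(t)

    # Rows 1 and 2 of V^T span the null space of t^T, i.e. all directions
    # perpendicular to the camera motion.
    _, _, vt = np.linalg.svd(t.reshape(1, 3))
    basis = vt[1:].T  # shape (3, 2)

    centroid = points.mean(axis=0)
    centered = points - centroid
    scatter = centered.T @ centered

    # Restrict the scatter matrix to the 2D null space; the eigenvector with
    # the smallest eigenvalue minimizes the squared point-to-plane distances.
    reduced = basis.T @ scatter @ basis
    _, eigvecs = np.linalg.eigh(reduced)  # eigenvalues in ascending order
    normal = basis @ eigvecs[:, 0]
    normal = normal / np.linalg.norm(normal)

    # Distance from the camera center (origin) to the fitted plane.
    height = abs(normal @ centroid)
    return normal, height


def scale_factor(points, motion, true_height):
    """Ratio between the known camera height and the estimated one."""
    _, estimated_height = fit_ground_plane(points, motion)
    return true_height / estimated_height
```

In this sketch, multiplying the up-to-scale translation by the returned factor would bring it to metric units, provided the true camera mounting height is known. Point selection across multiple frames and road-segmentation filtering, which the description also mentions, are outside the scope of the example.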