From local understanding to global regression in monocular visual odometry

The most significant part of any autonomous intelligent robot is the localization module that gives the robot knowledge about its position and orientation. This knowledge assists the robot to move to the location of its desired goal and complete its task. Visual Odometry (VO) measures the displaceme...

Bibliographic Details
Main Authors: Esfahani, Mahdi Abolfazli, Wu, Keyu, Yuan, Shenghai, Wang, Han
Other Authors: School of Electrical and Electronic Engineering
Format: Journal Article
Language: English
Published: 2022
Subjects: Engineering::Electrical and electronic engineering; Visual Odometry; Deep Learning
Online Access: https://hdl.handle.net/10356/155089
_version_ 1811681008432447488
author Esfahani, Mahdi Abolfazli
Wu, Keyu
Yuan, Shenghai
Wang, Han
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Esfahani, Mahdi Abolfazli
Wu, Keyu
Yuan, Shenghai
Wang, Han
author_sort Esfahani, Mahdi Abolfazli
collection NTU
description The most significant part of any autonomous intelligent robot is the localization module, which gives the robot knowledge of its position and orientation. This knowledge helps the robot move to its goal location and complete its task. Visual Odometry (VO) measures the displacement of the robot's camera between consecutive frames, which yields an estimate of the robot's position and orientation. Deep Learning now helps to learn rich and informative features for VO in order to estimate frame-by-frame camera movement. Recent Deep Learning-based VO methods train an end-to-end network to solve VO directly as a regression problem, without visualizing and sensing the labels of the training data during training. In this paper, a new approach to training Convolutional Neural Networks (CNNs) for regression problems such as VO is proposed. The proposed method first converts the problem to a classification problem in order to learn different subspaces with similar observations. After the classification problem is solved, the problem is converted back to the original regression problem, which is solved using the knowledge gained from the classification step. This approach helps the CNN solve the regression problem globally within a local domain learned in the classification step, and improves the performance of the regression module by approximately 10%.
first_indexed 2024-10-01T03:34:07Z
format Journal Article
id ntu-10356/155089
institution Nanyang Technological University
language English
last_indexed 2024-10-01T03:34:07Z
publishDate 2022
record_format dspace
spelling ntu-10356/155089 2022-02-11T06:43:20Z From local understanding to global regression in monocular visual odometry Esfahani, Mahdi Abolfazli Wu, Keyu Yuan, Shenghai Wang, Han School of Electrical and Electronic Engineering Engineering::Electrical and electronic engineering Visual Odometry Deep Learning The most significant part of any autonomous intelligent robot is the localization module, which gives the robot knowledge of its position and orientation. This knowledge helps the robot move to its goal location and complete its task. Visual Odometry (VO) measures the displacement of the robot's camera between consecutive frames, which yields an estimate of the robot's position and orientation. Deep Learning now helps to learn rich and informative features for VO in order to estimate frame-by-frame camera movement. Recent Deep Learning-based VO methods train an end-to-end network to solve VO directly as a regression problem, without visualizing and sensing the labels of the training data during training. In this paper, a new approach to training Convolutional Neural Networks (CNNs) for regression problems such as VO is proposed. The proposed method first converts the problem to a classification problem in order to learn different subspaces with similar observations. After the classification problem is solved, the problem is converted back to the original regression problem, which is solved using the knowledge gained from the classification step. This approach helps the CNN solve the regression problem globally within a local domain learned in the classification step, and improves the performance of the regression module by approximately 10%. 2022-02-11T06:42:18Z 2022-02-11T06:42:18Z 2020 Journal Article Esfahani, M. A., Wu, K., Yuan, S. & Wang, H. (2020). From local understanding to global regression in monocular visual odometry. International Journal of Pattern Recognition and Artificial Intelligence, 34(1), 2055002-.
https://dx.doi.org/10.1142/S0218001420550022 0218-0014 https://hdl.handle.net/10356/155089 10.1142/S0218001420550022 2-s2.0-85066103166 1 34 2055002 en International Journal of Pattern Recognition and Artificial Intelligence © 2020 World Scientific Publishing Company. All rights reserved.
spellingShingle Engineering::Electrical and electronic engineering
Visual Odometry
Deep Learning
Esfahani, Mahdi Abolfazli
Wu, Keyu
Yuan, Shenghai
Wang, Han
From local understanding to global regression in monocular visual odometry
title From local understanding to global regression in monocular visual odometry
title_full From local understanding to global regression in monocular visual odometry
title_fullStr From local understanding to global regression in monocular visual odometry
title_full_unstemmed From local understanding to global regression in monocular visual odometry
title_short From local understanding to global regression in monocular visual odometry
title_sort from local understanding to global regression in monocular visual odometry
topic Engineering::Electrical and electronic engineering
Visual Odometry
Deep Learning
url https://hdl.handle.net/10356/155089
work_keys_str_mv AT esfahanimahdiabolfazli fromlocalunderstandingtoglobalregressioninmonocularvisualodometry
AT wukeyu fromlocalunderstandingtoglobalregressioninmonocularvisualodometry
AT yuanshenghai fromlocalunderstandingtoglobalregressioninmonocularvisualodometry
AT wanghan fromlocalunderstandingtoglobalregressioninmonocularvisualodometry