Deep learning of monocular depth, optical flow and ego-motion with geometric guidance for UAV navigation in dynamic environments

Computer vision-based depth estimation and visual odometry provide perceptual information useful for robot navigation tasks such as obstacle avoidance. However, despite the proliferation of state-of-the-art convolutional neural network (CNN) models for monocular depth, ego-motion and optical flow estimation, relatively little work has been reported on their practical application to unmanned aerial vehicle (UAV) navigation. This is due to well-known challenges: embedded hardware constraints, viewpoint variations, the scarcity of aerial image datasets, and the intricacies of dynamic environments. We address these limitations to facilitate real-world deployment of CNNs in UAV navigation. First, we devise an efficient confidence-weighted adaptive network (Cowan) training framework that iteratively leverages intermediate prediction confidences to enforce cross-task consistency over corresponding image regions. This achieves competitive accuracy with a lightweight CNN capable of real-time execution on resource-constrained embedded systems. Second, we devise a test-time refinement method that adapts the network to dynamic environments while simultaneously improving accuracy. To accomplish this, we first update the ego-motion estimate using pose information from an on-board inertial measurement unit (IMU). Then, we decompose the UAV's motion into its constituent vectors and, for each axis, formulate geometric relationships between depth and translation. Based on this information, we triangulate corresponding points acquired through optical flow. Finally, we enforce geometric consistency between the initially updated pose and the triangulated depth. Cowan with geometry-guided refinement (Cowan-GGR) achieves significant gains in accuracy and robustness. Field tests show that the proposed model is capable of accurate depth and object-level motion perception in real-world dynamic environments, proving its efficacy for UAV navigation.
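The abstract describes Cowan's confidence-weighted cross-task consistency but gives no equations. The following minimal numpy sketch shows one plausible reading of the idea: the rigid flow induced by the predicted depth and pose is compared against the predicted optical flow, with the residual down-weighted wherever either prediction is uncertain. All function names, shapes and the joint weighting `conf_d * conf_f` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rigid_flow_from_depth(depth, K, R, t):
    """Optical flow induced by camera motion (R, t) over a static scene:
    back-project each pixel with its predicted depth, apply the relative
    pose, and re-project into the second view."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)]).reshape(3, -1).astype(float)
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # 3-D points, frame 1
    proj = K @ (R @ pts + t[:, None])                     # projected into frame 2
    uv2 = (proj[:2] / proj[2:]).reshape(2, H, W)
    return np.stack([uv2[0] - u, uv2[1] - v], axis=-1)    # (H, W, 2) rigid flow

def confidence_weighted_consistency(depth, flow, conf_d, conf_f, K, R, t):
    """Cross-task (depth <-> flow) consistency term, down-weighted where
    either task is unconfident, e.g. on independently moving objects."""
    resid = np.linalg.norm(flow - rigid_flow_from_depth(depth, K, R, t), axis=-1)
    w = conf_d * conf_f              # joint per-pixel confidence (assumption)
    return (w * resid).sum() / (w.sum() + 1e-8)
```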

Bibliographic Details
Main Authors: Fuseini Mumuni (Department of Electrical and Electronic Engineering, University of Mines and Technology (UMaT), Tarkwa, Ghana; corresponding author), Alhassan Mumuni (Department of Electrical and Electronic Engineering, Cape Coast Technical University, Cape Coast, Ghana), Christian Kwaku Amuzuvi (Department of Renewable Energy Engineering, University of Mines and Technology (UMaT), Tarkwa, Ghana)
Format: Article
Language: English
Published: Elsevier, 2022-12-01
Series: Machine Learning with Applications, Vol. 10, Article 100416
ISSN: 2666-8270
Subjects: Computer vision; Monocular depth; Optical flow; Camera pose; Obstacle avoidance; Visual odometry
Online Access: http://www.sciencedirect.com/science/article/pii/S2666827022000913
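To make the test-time refinement step in the abstract concrete, below is a minimal numpy sketch of the triangulation and geometric-consistency check, assuming correspondences x2 = x1 + flow come from the optical flow network and (R, t) is the IMU-updated relative pose. The mid-point triangulation and every name here are illustrative choices; the paper formulates the geometry per motion axis, which this sketch does not reproduce.

```python
import numpy as np

def triangulated_depths(x1, x2, K, R, t):
    """Mid-point triangulation of correspondences x1 -> x2 (pixel coords,
    shape (N, 2)) obtained from optical flow, under the IMU-updated
    relative pose (R, t). Returns each point's depth in the first frame.
    Note: ill-posed when t is near zero (pure rotation)."""
    Kinv = np.linalg.inv(K)
    c2 = -R.T @ t                              # second camera centre, frame 1
    depths = np.empty(len(x1))
    for i, (p, q) in enumerate(zip(x1, x2)):
        d1 = Kinv @ np.append(p, 1.0)          # ray through x1 (unit z)
        d2 = R.T @ (Kinv @ np.append(q, 1.0))  # ray through x2, in frame 1
        # Closest points on the two rays: min ||a*d1 - (c2 + b*d2)||^2
        A = np.array([[d1 @ d1, -(d1 @ d2)],
                      [d1 @ d2, -(d2 @ d2)]])
        rhs = np.array([d1 @ c2, d2 @ c2])
        a, b = np.linalg.solve(A, rhs)
        depths[i] = a                          # d1 has unit z, so a is depth
    return depths

def geometric_consistency(depth_pred, x1, x2, K, R, t):
    """Residual between the CNN depth at the sampled pixels and the depth
    triangulated from flow + pose, usable as a refinement signal."""
    d_tri = triangulated_depths(x1, x2, K, R, t)
    d_cnn = depth_pred[x1[:, 1].astype(int), x1[:, 0].astype(int)]
    return np.abs(d_cnn - d_tri).mean()
```

In a test-time refinement loop, this residual (or a differentiable variant of it) would be minimised by updating the network, adapting it to the scene as the abstract describes.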