Deep Architecture With Cross Guidance Between Single Image and Sparse LiDAR Data for Depth Completion

It is challenging to apply depth maps generated from sparse laser scan data to computer vision tasks, such as robot vision and autonomous driving, because of the sparsity and noise in the data. To overcome this problem, depth completion tasks have been proposed to produce a dense depth map from sparse LiDAR data and a single RGB image.


Bibliographic Details
Main Authors: Sihaeng Lee, Janghyeon Lee, Doyeon Kim, Junmo Kim
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Depth estimation; depth completion; LiDAR data; cross guidance; multi-scale dilated convolutional block
Online Access: https://ieeexplore.ieee.org/document/9078070/
_version_ 1818619479633952768
author Sihaeng Lee
Janghyeon Lee
Doyeon Kim
Junmo Kim
author_sort Sihaeng Lee
collection DOAJ
description It is challenging to apply depth maps generated from sparse laser scan data to computer vision tasks, such as robot vision and autonomous driving, because of the sparsity and noise in the data. To overcome this problem, depth completion tasks have been proposed to produce a dense depth map from sparse LiDAR data and a single RGB image. In this study, we developed a deep convolutional architecture with cross guidance for multi-modal feature fusion to compensate for the limited representation power of each individual modality. Two encoders, which form part of the proposed architecture, receive different modalities as inputs; they interact with each other by exchanging information at each stage of encoding through an attention mechanism. We also propose a residual atrous spatial pyramid block, comprising multiple dilated convolutions with different dilation rates, which is used to extract highly significant features. Experimental results on the KITTI depth completion benchmark dataset demonstrate that the proposed architecture outperforms other models trained in two-dimensional space, without pre-training or fine-tuning on other datasets.
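To make the two architectural ideas in the description concrete, the cross guidance between the image and LiDAR encoders and the residual atrous spatial pyramid block, here is a minimal PyTorch sketch. It is not the authors' released implementation: the module names, channel counts, dilation rates (1, 2, 4, 8), the channel-attention form of the guidance, and the feature-map size in the usage example are all illustrative assumptions.

```python
# Hypothetical sketch (not the paper's released code): one way to realize the
# two ideas named in the abstract, with made-up layer names and hyperparameters.
import torch
import torch.nn as nn


class ResidualASPPBlock(nn.Module):
    """Residual atrous spatial pyramid block: parallel dilated convolutions
    with different dilation rates, fused by a 1x1 conv and added to the input."""

    def __init__(self, channels, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])
        # 1x1 convolution fuses the concatenated multi-scale features back to `channels`
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1, bias=False)

    def forward(self, x):
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        return x + self.fuse(multi_scale)  # residual connection


class CrossGuidance(nn.Module):
    """Cross guidance between encoder stages: each modality produces a
    channel-attention vector that reweights the other modality's features."""

    def __init__(self, channels):
        super().__init__()
        self.rgb_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid()
        )
        self.lidar_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid()
        )

    def forward(self, rgb_feat, lidar_feat):
        # Features from one modality gate the other, so information is
        # exchanged between the two encoders at this stage.
        rgb_out = rgb_feat * self.lidar_gate(lidar_feat)
        lidar_out = lidar_feat * self.rgb_gate(rgb_feat)
        return rgb_out, lidar_out


if __name__ == "__main__":
    rgb = torch.randn(1, 64, 88, 304)    # RGB encoder feature map (size is illustrative)
    lidar = torch.randn(1, 64, 88, 304)  # sparse-depth encoder feature map
    rgb, lidar = CrossGuidance(64)(rgb, lidar)
    out = ResidualASPPBlock(64)(lidar)
    print(out.shape)  # torch.Size([1, 64, 88, 304])
```

In this sketch the guidance is a simple squeeze-and-excitation style gate computed from the opposite modality; the paper's attention mechanism may differ in detail, but the pattern of exchanging information between the two encoders at each stage is what the example is meant to illustrate.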
first_indexed 2024-12-16T17:38:09Z
format Article
id doaj.art-8391deaf3ee3477c9128f118d0d86cd9
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-16T17:38:09Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-8391deaf3ee3477c9128f118d0d86cd9 (2022-12-21T22:22:41Z, eng)
IEEE Access (IEEE), ISSN 2169-3536, published 2020-01-01, Volume 8, pp. 79801-79810
DOI: 10.1109/ACCESS.2020.2990212, IEEE Xplore article 9078070
Deep Architecture With Cross Guidance Between Single Image and Sparse LiDAR Data for Depth Completion
Sihaeng Lee (https://orcid.org/0000-0001-5328-2011), Division of Future Vehicle, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Janghyeon Lee (https://orcid.org/0000-0002-8599-4678), School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Doyeon Kim (https://orcid.org/0000-0003-3717-7275), School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Junmo Kim (https://orcid.org/0000-0002-7174-7932), Division of Future Vehicle, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
https://ieeexplore.ieee.org/document/9078070/
Depth estimation; depth completion; LiDAR data; cross guidance; multi-scale dilated convolutional block
title Deep Architecture With Cross Guidance Between Single Image and Sparse LiDAR Data for Depth Completion
topic Depth estimation
depth completion
LiDAR data
cross guidance
multi-scale dilated convolutional block
url https://ieeexplore.ieee.org/document/9078070/