Self-supervised sparse-to-dense: Self-supervised depth completion from LiDAR and monocular camera

Bibliographic Details
Main Authors: Ma, Fangchang; Venturelli Cavalheiro, Guilherme; Karaman, Sertac
Other Authors: Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Format: Article
Language: English
Published: IEEE 2020
Online Access: https://hdl.handle.net/1721.1/126545
Description
Summary: © 2019 IEEE. Depth completion, the technique of estimating a dense depth image from sparse depth measurements, has a variety of applications in robotics and autonomous driving. However, depth completion faces three main challenges: the irregularly spaced pattern in the sparse depth input, the difficulty of handling multiple sensor modalities (when color images are available), and the lack of dense, pixel-level ground-truth depth labels for training. In this work, we address all of these challenges. Specifically, we develop a deep regression model that learns a direct mapping from sparse depth (and color image) input to dense depth prediction. We also propose a self-supervised training framework that requires only sequences of color and sparse depth images, without the need for dense depth labels. Our experiments demonstrate that the self-supervised framework outperforms a number of existing solutions trained with semi-dense annotations. Furthermore, when trained with semi-dense annotations, our network attains state-of-the-art accuracy and was the winning approach on the KITTI depth completion benchmark at the time of submission.
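To make the training signal described in the abstract concrete, below is a minimal Python (PyTorch) sketch of a two-term self-supervised objective of the kind such a framework could use: an L1 penalty against the sparse LiDAR returns, plus a photometric penalty against a nearby frame warped into the current view. The function name, tensor shapes, and weight w_photo are illustrative assumptions, not the authors' implementation; the warping and the relative-pose estimation it requires (the abstract notes only sequences of color and sparse depth images are needed) are assumed to happen elsewhere and are omitted here, as is any regularization.

```python
import torch
import torch.nn.functional as F

def self_supervised_loss(pred_depth, sparse_depth, rgb_curr, rgb_warped,
                         w_photo=0.1):
    # pred_depth:   (B, 1, H, W) dense depth predicted by the network
    # sparse_depth: (B, 1, H, W) LiDAR input, 0 at pixels with no return
    # rgb_curr:     (B, 3, H, W) current color frame
    # rgb_warped:   (B, 3, H, W) nearby frame warped into the current view
    #               using pred_depth and an estimated relative pose
    # w_photo:      illustrative weight, not a value from the paper

    # Depth term: supervise only at pixels with an actual LiDAR return.
    valid = sparse_depth > 0
    loss_depth = F.l1_loss(pred_depth[valid], sparse_depth[valid])

    # Photometric term: if the predicted depth (and the pose used for
    # warping) are accurate, the warped frame should match the current one.
    loss_photo = F.l1_loss(rgb_warped, rgb_curr)

    return loss_depth + w_photo * loss_photo
```

The split mirrors the abstract's point: the sparse LiDAR returns supervise only a small fraction of pixels, so the photometric term over image sequences is what provides a dense training signal without ground-truth depth labels.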