An Unsupervised Monocular Image Depth Prediction Algorithm Based on Multiple Loss Deep Learning

Bibliographic Details
Main Authors: Xiaojiao Tang, Lifang Chen
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8889754/
Description
Summary: To improve prediction accuracy while keeping execution time low when generating image depth maps, we investigate unsupervised monocular image depth prediction. This paper designs an unsupervised monocular image depth prediction method based on multiple-loss deep learning, with two main contributions. First, we propose a monocular image depth estimation algorithm based on multi-scale feature extraction, which consists of two parts: a feature extraction network and a deconvolution prediction network. The feature extraction network extracts image features at different levels of the network and feeds these multi-scale features into the deconvolution layers without changing the image resolution. After training, the network predicts the left and right disparity maps. Second, we introduce a new multiple loss function that draws on the asymmetric parameters of the training model and the epipolar geometry constraint. The Multi-Scale Structural Similarity Index (MS-SSIM) and L1 losses are combined as the image reconstruction loss, and the left-right disparity consistency and flipped left-right disparity consistency terms are incorporated into the loss used to train the network model. Simulation results show that the method effectively improves prediction accuracy, particularly for complex images containing mirrors, transparent surfaces, and shadows. The method is further evaluated on the KITTI dataset, where its end-to-end results exceed even those of a supervised method.
ISSN: 2169-3536
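
Illustrative note: The loss combination described in the summary (an SSIM-style term plus L1 for image reconstruction, together with a left-right disparity consistency term) can be sketched on a single rectified scanline. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses a single-scale, whole-row SSIM as a stand-in for MS-SSIM, it omits the flipped left-right consistency term, and the names `warp_row`, `reconstruction_loss`, `lr_consistency_loss`, `total_loss`, the weights `alpha` and `lam`, and the sign convention (left pixel `x` matches right pixel `x - d`) are all hypothetical choices made for illustration.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)

def warp_row(src, disp):
    """Sample src at x - disp[x] with linear interpolation, clamped to the row."""
    n = len(src)
    out = []
    for x, d in enumerate(disp):
        pos = min(max(x - d, 0.0), n - 1.0)  # clamp to the row borders
        x0 = int(pos)
        x1 = min(x0 + 1, n - 1)
        w = pos - x0
        out.append((1.0 - w) * src[x0] + w * src[x1])
    return out

def ssim(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-scale SSIM over the whole row (assumed stand-in for MS-SSIM)."""
    mu_a, mu_b = mean(a), mean(b)
    var_a = mean((p - mu_a) ** 2 for p in a)
    var_b = mean((q - mu_b) ** 2 for q in b)
    cov = mean((p - mu_a) * (q - mu_b) for p, q in zip(a, b))
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))

def reconstruction_loss(target, recon, alpha=0.85):
    """Weighted sum of a structural term and an L1 term (alpha is assumed)."""
    l1 = mean(abs(t - r) for t, r in zip(target, recon))
    return alpha * (1.0 - ssim(target, recon)) / 2.0 + (1.0 - alpha) * l1

def lr_consistency_loss(disp_l, disp_r):
    """Left disparity should agree with the right disparity it points to."""
    projected = warp_row(disp_r, disp_l)
    return mean(abs(dl - dr) for dl, dr in zip(disp_l, projected))

def total_loss(left, right, disp_l, disp_r, alpha=0.85, lam=1.0):
    recon_left = warp_row(right, disp_l)  # rebuild the left row from the right
    return (reconstruction_loss(left, recon_left, alpha)
            + lam * lr_consistency_loss(disp_l, disp_r))

# Toy scanline: the left row is the right row shifted by 2 pixels.
right = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
left = [right[max(x - 2, 0)] for x in range(len(right))]
good = total_loss(left, right, [2.0] * 8, [2.0] * 8)  # correct disparity
bad = total_loss(left, right, [0.0] * 8, [2.0] * 8)   # wrong disparity
print(good < bad)
```

In this toy case the correct disparity reconstructs the left row exactly, so both loss terms vanish, while a zero disparity is penalized by the reconstruction and the consistency terms alike. No ground-truth depth appears anywhere in the loss, which is what makes the training signal unsupervised.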