Depth perception in challenging weathers


Bibliographic Details
Main Author: Zhang, Haoyuan
Other Authors: Wang, Dan Wei
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2023
Subjects: Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/166568
DOI: 10.32657/10356/166568
Citation: Zhang, H. (2023). Depth perception in challenging weathers. Doctoral thesis, Nanyang Technological University, Singapore.
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Description

Depth information underpins numerous applications in robotics and autonomous driving. With the sensors commonly used in robotics (i.e., color cameras and 3D Lidar), the mainstream approaches to acquiring dense depth maps are depth estimation with a monocular camera, stereo matching with dual cameras, and depth completion with a single camera and sparse Lidar. Because the modalities involved have different properties, these approaches naturally differ in reliability and robustness. Although deep-learning-based solutions have boosted the performance of all these approaches in recent years, most methods target only ideal conditions with good illumination and a clear view. Since outdoor environments are unavoidable for robots and autonomous vehicles, depth perception under varied lighting and weather conditions is a meaningful task and an open problem. This thesis investigates several approaches to acquiring dense depth maps and several strategies for improving deep-neural-network (DNN) models; experiments with the proposed frameworks show significant improvements across a range of settings and ease the difficulty of acquiring training data in such conditions.

First, because the color image is the most mature modality in deep learning and stereo matching rests on a complete theory, acquiring depth maps in challenging conditions from stereo images is investigated. A supervised transfer learning strategy is applied to a carefully designed integration of a condition-specific perception enhancement network and a fast convolutional stereo matching algorithm. Furthermore, since the effectiveness of neural networks depends heavily on the quantity of data, and synthetic data is much easier to collect than real-world data, an unsupervised domain adaptation framework is proposed and validated in a synthetic-to-real-world setting. A novel loss function, the soft warping loss, is proposed to both speed up training and improve performance. Most existing work validates the synthetic-to-real-world setting under the assumption that weather and lighting conditions in the synthetic domain are less challenging than in the real-world data; to test the proposed methods further, experiments transferring from synthetic ideal weather to real-world adverse weather are conducted, verifying that the framework is independent of the particular domain pair.

Although unsupervised domain adaptation for stereo matching requires no labels on the target domain, the inherent vulnerability of color cameras can still lead to poor robustness under adverse conditions. To leverage the intrinsic robustness of the Lidar modality across lighting conditions, a review of modality fusion approaches is conducted, which shows that existing fusion methods suffer from channel redundancy and a lack of geometry guidance. To reduce channel redundancy and embed geometry information in the spatial dimension, the Geometry Attention-based Lightweight Fusion (GALiF) backbone is proposed. In addition, an entropy-based loss function is proposed to fulfill the potential of weighted summation: based on the maximum-entropy principle, the loss drives the two branches to compete, keeping the different parts of the network active.
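The abstract does not give the exact formulation of the soft warping loss. Warping-based stereo losses generally compare one view against the other view resampled through the predicted disparity; the sketch below is a minimal, hypothetical PyTorch illustration of that general idea, assuming "soft" refers to differentiable bilinear sampling (the function name and the simple L1 photometric penalty are assumptions, not the thesis's formulation):

```python
import torch
import torch.nn.functional as F

def soft_warping_photometric_loss(left, right, disparity):
    """Photometric loss via differentiable (bilinear, hence "soft")
    warping: the right image is resampled at positions shifted by the
    predicted disparity and compared against the left image.

    left, right: (B, 3, H, W) rectified stereo pair, values in [0, 1].
    disparity:   (B, 1, H, W) predicted left-view disparity in pixels.
    """
    _, _, h, w = left.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h, device=left.device),
        torch.linspace(-1.0, 1.0, w, device=left.device),
        indexing="ij",
    )
    # Shift x-coordinates by the disparity, rescaled to grid units.
    xs = xs.unsqueeze(0) - 2.0 * disparity.squeeze(1) / (w - 1)
    grid = torch.stack([xs, ys.unsqueeze(0).expand_as(xs)], dim=-1)

    warped = F.grid_sample(right, grid, align_corners=True)
    return (warped - left).abs().mean()  # simple L1 photometric penalty
```

Likewise, the entropy-based fusion loss is described only at the level of the maximum-entropy principle. One plausible realization, assuming per-pixel softmax weights over an RGB branch and a Lidar branch (names and shapes are illustrative, not taken from the thesis):

```python
import torch

def fuse_with_entropy_loss(depth_rgb, depth_lidar, weight_logits, beta=0.1):
    """Weighted summation of two branch predictions plus a
    maximum-entropy regularizer on the fusion weights.

    depth_rgb, depth_lidar: (B, 1, H, W) per-branch depth predictions.
    weight_logits:          (B, 2, H, W) unnormalized fusion weights.
    beta:                   strength of the entropy regularizer.
    """
    w = torch.softmax(weight_logits, dim=1)  # per-pixel weights summing to 1
    fused = w[:, :1] * depth_rgb + w[:, 1:] * depth_lidar

    # Penalizing low entropy keeps either branch from being ignored,
    # so the two branches must "compete" for fusion weight.
    entropy = -(w * torch.log(w.clamp_min(1e-8))).sum(dim=1).mean()
    entropy_loss = -beta * entropy  # minimizing this maximizes entropy

    return fused, entropy_loss
```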
With the proposed methods, the fusion of the color and Lidar modalities achieves state-of-the-art performance on benchmarks in the literature. Finally, a label-free training strategy is proposed for depth completion, based only on noisy Lidar data and color images. This setting is the most challenging because the label-acquisition process is tremendously complicated. The whole pipeline relies on a self-supervised learning framework and pseudo-label generation: a statistical filter is developed, and a conventional, domain-free method is used to generate dense depth maps as the final pseudo labels. The commonly used label-collection procedure condenses several Lidar frames into one dense depth map, which is infeasible in adverse weather due to the low quality of each individual frame; GAN-based style transfer is likewise infeasible because adverse-weather data is rarely collected. The proposed self-supervised framework thus offers a novel way to fine-tune without labels or expensive external sensors. Experiments on real-world adverse-weather data show a significant error reduction of approximately 30-50%, depending on the degree of weather-induced noise.
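The abstract likewise leaves the statistical filter unspecified. One plausible form, sketched below purely as an assumption, is a local outlier test that discards Lidar returns deviating too far from the statistics of their valid neighbors, since weather effects such as rain and snow typically appear as isolated spurious returns:

```python
import torch
import torch.nn.functional as F

def statistical_lidar_filter(sparse_depth, k=7, n_sigma=2.0):
    """Remove weather-induced outliers from a sparse Lidar depth map
    before it is used for pseudo-label generation.

    sparse_depth: (B, 1, H, W), zeros where no Lidar return exists.
    A return is kept only if it lies within n_sigma local standard
    deviations of the mean of the valid returns in its k x k window.
    """
    valid = (sparse_depth > 0).float()
    pad = k // 2

    # Local sums over a k x k window; zero padding contributes nothing,
    # so only valid returns are counted.
    count = F.avg_pool2d(valid, k, stride=1, padding=pad) * k * k
    s1 = F.avg_pool2d(sparse_depth, k, stride=1, padding=pad) * k * k
    s2 = F.avg_pool2d(sparse_depth ** 2, k, stride=1, padding=pad) * k * k

    mean = s1 / count.clamp_min(1.0)
    var = (s2 / count.clamp_min(1.0) - mean ** 2).clamp_min(0.0)

    inlier = (sparse_depth - mean).abs() <= n_sigma * var.sqrt() + 1e-3
    return sparse_depth * valid * inlier.float()
```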