Learning dense prediction: from correspondence to segmentation

Bibliographic Details
Main Author:	Zhang, F
Other Authors:	Torr, P
Format:	Thesis
Language:	English
Published:	2022
Subjects:	Computer vision Deep learning (Machine learning)

collection	OXFORD
description	<p>Dense prediction is the task of predicting a label for each pixel in an image. Given 3D data (point clouds or RGB-D images) as input, dense prediction can also be extended to 3D space, assigning a label to each 3D point or location. According to the label type, dense prediction tasks can be broadly categorized as depth estimation, motion prediction, segmentation, and other related tasks. There are four major challenges in learning dense prediction: i) improving accuracy and resolving ambiguous regions, ii) high memory and computational costs, iii) the dependency on large amounts of labeled training data, and iv) poor cross-domain generalization to novel datasets.</p> <p>This integrated thesis focuses on dense prediction tasks, from correspondence estimation (stereo matching and optical flow) to 2D/3D semantic segmentation. Seven robust deep neural network models are proposed to achieve state-of-the-art accuracy, to enable effective training with only synthetic data or unlabeled real data, and to improve cross-domain generalization to various unseen datasets.</p> <p>For the first task, traditional 3D geometry constraints are embedded into end-to-end trainable stereo matching networks, achieving state-of-the-art accuracy on two stereo matching benchmarks (at the time of publication). Building on this work, a domain-invariant stereo matching network is proposed; it is trained on synthetic data but outperforms many models fine-tuned on real data. For the second task, a Separable Flow network is developed for optical flow estimation, which ranked first on two standard optical flow benchmarks (at the time of publication). It is also one of the best methods for predicting optical flow on various unseen datasets. In addition, research is conducted on unsupervised pre-training and domain adaptation for semantic image segmentation. Finally, 2D image segmentation knowledge is leveraged to tackle 3D segmentation. The proposed 3D segmentation networks achieved leading positions on large-scale point-cloud segmentation benchmarks (at the time of publication).</p>
first_indexed	2024-03-07T07:42:20Z
id	oxford-uuid:51cb4805-f932-41dd-9dab-d2c64b932ce7
institution	University of Oxford
last_indexed	2024-03-07T07:42:20Z
record_format	dspace
spelling	oxford-uuid:51cb4805-f932-41dd-9dab-d2c64b932ce7; 2023-05-15T14:11:01Z; Learning dense prediction: from correspondence to segmentation; Thesis; http://purl.org/coar/resource_type/c_db06; uuid:51cb4805-f932-41dd-9dab-d2c64b932ce7; Computer vision; Deep learning (Machine learning); English; Hyrax Deposit; 2022; Zhang, F; Torr, P; Prisacariu, V

Similar Items