Effective representations for road scene understanding with scalable learning

Autonomous vehicles require an accurate understanding of the scene for safe operation in real-world driving scenarios. This thesis examines and offers effective representations for road scene understanding based on in-situ perception, which can be employed directly for planning and decision making in a variety of complex urban environments and under wide-ranging environmental conditions.

A common, versatile scene representation is the pixel-wise semantic segmentation produced by deep neural networks. However, this representation is limited in its direct usefulness for several reasons. Firstly, pixel-wise semantic segmentation does not naturally support the high-level reasoning required for complex driving manoeuvres. This thesis resolves that limitation by focussing on a hierarchical, graph-based representation, the scene graph, which combines segmented entities and object-centric perception in bird's-eye view at an abstraction level suitable for decision making. Road markings are a crucial prerequisite for this representation, as their underlying meaning dictates the desired driving behaviour. Secondly, semantic segmentation often lacks this semantic understanding of road markings because of the inordinate cost of labelling adequate training data. To address this, the thesis presents and compares a model-driven and a data-driven approach to self-supervised road marking classification: whereas the former leverages additional sensor modalities and domain knowledge, the latter employs state-of-the-art image-to-image translation techniques to synthesize training data. Thirdly, semantic segmentation is commonly performed in the front-facing perspective, which neither explicitly encodes distances nor links directly to the vehicle's action space. We tackle this problem by learning an improved bird's-eye-view mapping, called "boosted IPM", which aids scene graph generation in real-world scenarios.
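For intuition, classical inverse perspective mapping (IPM), the baseline that "boosted IPM" improves upon, warps the front-facing camera image onto an assumed flat ground plane with a single homography. The sketch below (Python with OpenCV) is purely illustrative and not code from the thesis; the point correspondences and output size are hypothetical placeholders.

    # Illustrative classical IPM, not code from the thesis.
    # The correspondences and output size below are hypothetical.
    import cv2
    import numpy as np

    # Four road-plane points in the front-facing image (pixels)...
    src = np.float32([[550, 460], [730, 460], [1280, 720], [0, 720]])
    # ...and their targets on a metric bird's-eye-view grid (pixels).
    dst = np.float32([[300, 0], [500, 0], [500, 800], [300, 800]])

    H = cv2.getPerspectiveTransform(src, dst)  # 3x3 ground-plane homography

    def to_birds_eye(image):
        # Warp onto the assumed flat ground plane; anything violating the
        # flat-world assumption (e.g. vehicles) appears stretched, which is
        # one of the artefacts a learned bird's-eye-view mapping can reduce.
        return cv2.warpPerspective(image, H, (800, 800))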

In addressing the above, we introduce scalable, self-supervised learning techniques built on design principles such as transfer learning, leveraging domain knowledge, and data synthesis. Furthermore, we improve the robustness of the representations under wide-ranging environmental conditions through image restoration and by learning appearance-invariant representations. The presented methodologies serve as a valuable starting point for devising effective and efficient representations for road scene understanding based on in-situ perception in real-world scenarios.
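As a concrete illustration of the transfer-learning design principle mentioned above, a common recipe freezes a pretrained backbone and trains only a small task-specific head, for instance for road marking classification. The sketch below uses PyTorch/torchvision; the backbone choice and class count are assumptions, not details from the thesis.

    # Illustrative transfer-learning recipe, not the thesis's training code.
    # The backbone choice and the number of classes (8) are assumptions.
    import torch.nn as nn
    import torchvision.models as models

    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in backbone.parameters():
        p.requires_grad = False        # keep the pretrained features fixed

    # Replace the classification head for the downstream marking classes;
    # only this new layer's parameters receive gradients during training.
    backbone.fc = nn.Linear(backbone.fc.in_features, 8)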

Bibliographic Details
Main Author: Bruls, TAH
Other Authors: Newman, P
Format: Thesis
Language: English
Published: 2020
Institution: University of Oxford
Subjects: Robot vision; Automated vehicles