Efficient object detection and discovery for real-world robotics applications



Bibliographic Details
Main Author: Engelcke, M
Other Authors: Posner, I
Format: Thesis
Language: English
Published: 2020
Subjects:
_version_ 1826315698810388480
author Engelcke, M
author2 Posner, I
author_facet Posner, I
Engelcke, M
author_sort Engelcke, M
collection OXFORD
description <p>Robots need to interact with the world to complete practical tasks. This typically requires interacting with objects, making object perception a fundamental capability in robotics. Real-world robotics applications present particular challenges and opportunities for object perception. It is possible to utilise data from a variety of sensing modalities, such as cameras and lidar devices, but physical hardware constraints demand careful management of computational resources. Similarly, the need to respond to changing environmental conditions and task definitions requires robots to learn efficiently from limited human supervision. Efficiency in terms of both computation and supervision is thus essential. To these ends, this thesis makes several contributions. Firstly, we develop Vote3Deep, which is, to the best of our knowledge, the first method that performs efficient, native 3D object detection in real-world lidar point clouds with convolutional neural networks (CNNs). This is achieved by maintaining the natural sparsity inherent in the data throughout the CNN hierarchy and by leveraging an efficient, sparse convolution operation. Secondly, we present GENESIS, which is, to the best of our knowledge, the first object-centric generative model that learns to decompose rendered images of simulated 3D environments into object-like components without supervision, while also being able to generate entire coherent scenes in an object-centric fashion. This is facilitated by an autoregressive prior that captures correlations between objects in the generative model, offering the potential of utilising GENESIS as an object-centric "world model" for efficient skill acquisition. Given that we want to apply object-centric generative models in real-world applications, we finally develop GENESIS++, which utilises a novel algorithm for fully differentiable clustering of instance embeddings to facilitate the symmetric inference of object representations. We show that GENESIS++ outperforms competitive baselines both on well-established simulated datasets and on challenging real-world datasets collected in the context of robotics applications.</p>
first_indexed 2024-03-06T19:01:50Z
format Thesis
id oxford-uuid:13cefafb-89e5-42dd-990e-822ecd9224ba
institution University of Oxford
language English
last_indexed 2024-12-09T03:30:49Z
publishDate 2020
record_format dspace
spelling oxford-uuid:13cefafb-89e5-42dd-990e-822ecd9224ba
2024-12-01T14:10:59Z
Efficient object detection and discovery for real-world robotics applications
Thesis
http://purl.org/coar/resource_type/c_db06
uuid:13cefafb-89e5-42dd-990e-822ecd9224ba
Machine learning
Robotics
Computer vision
English
Hyrax Deposit
2020
Engelcke, M
Posner, I
Vedaldi, A
Urtasun, R
spellingShingle Machine learning
Robotics
Computer vision
Engelcke, M
Efficient object detection and discovery for real-world robotics applications
title Efficient object detection and discovery for real-world robotics applications
title_full Efficient object detection and discovery for real-world robotics applications
title_fullStr Efficient object detection and discovery for real-world robotics applications
title_full_unstemmed Efficient object detection and discovery for real-world robotics applications
title_short Efficient object detection and discovery for real-world robotics applications
title_sort efficient object detection and discovery for real world robotics applications
topic Machine learning
Robotics
Computer vision
work_keys_str_mv AT engelckem efficientobjectdetectionanddiscoveryforrealworldroboticsapplications