GENESIS-V2: inferring unordered object representations without iterative refinement

Advances in unsupervised learning of object representations have culminated in the development of a broad range of methods for unsupervised object segmentation and interpretable object-centric scene generation. These methods, however, are limited to simulated and real-world datasets with limited vis...

Bibliographic Details
Main Authors: Engelcke, M, Parker Jones, OP, Posner, I
Format: Conference item
Language: English
Published: Curran Associates 2022
Description: Advances in unsupervised learning of object representations have culminated in a broad range of methods for unsupervised object segmentation and interpretable object-centric scene generation. These methods, however, are limited to simulated and real-world datasets with limited visual complexity. Moreover, object representations are often inferred using RNNs, which do not scale well to large images, or using iterative refinement, which avoids imposing an unnatural ordering on objects in an image but requires the a priori initialisation of a fixed number of object representations. In contrast to established paradigms, this work proposes an embedding-based approach in which embeddings of pixels are clustered in a differentiable fashion using a stochastic stick-breaking process. Similar to iterative refinement, this clustering procedure also leads to randomly ordered object representations, but without the need to initialise a fixed number of clusters a priori. This is used to develop a new model, GENESIS-v2, which can infer a variable number of object representations without using RNNs or iterative refinement. We show that GENESIS-v2 performs strongly in comparison to recent baselines in terms of unsupervised image segmentation and object-centric scene generation on established synthetic datasets as well as more complex real-world datasets.
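
The clustering idea named in the description can be illustrated with a small, generic sketch. It is not the authors' implementation: the function name `stick_breaking_masks`, the Gaussian kernel, the `sigma` and `max_clusters` parameters, and the early-stopping threshold are all assumptions made for illustration. It only shows how a stochastic stick-breaking scheme can turn per-pixel embeddings into soft masks whose order depends on random seed choices and whose count is not fixed in advance.

```python
import math
import random


def stick_breaking_masks(embeddings, max_clusters=5, sigma=1.0, seed=0):
    """Soft-cluster embeddings with a stochastic stick-breaking scheme.

    embeddings: list of equal-length float vectors, one per pixel.
    Returns a list of soft masks (one list of per-pixel weights per cluster).
    For every pixel, the weights sum to 1 across the returned masks.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    n = len(embeddings)
    scope = [1.0] * n  # portion of each pixel's "stick" not yet assigned
    masks = []
    for _ in range(max_clusters - 1):
        # Stop early once almost everything is explained (assumed criterion).
        if sum(scope) < 1e-6 * n:
            break
        # Sample a seed pixel with probability proportional to remaining scope.
        idx = rng.choices(range(n), weights=scope, k=1)[0]
        centre = embeddings[idx]
        # Gaussian kernel: similarity in (0, 1] between each pixel and the seed.
        alpha = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(e, centre)) / sigma)
            for e in embeddings
        ]
        # Break off a piece of the remaining stick for this cluster.
        masks.append([s * a for s, a in zip(scope, alpha)])
        scope = [s * (1.0 - a) for s, a in zip(scope, alpha)]
    # Whatever is left over forms the final mask.
    masks.append(scope)
    return masks


if __name__ == "__main__":
    pixels = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    for mask in stick_breaking_masks(pixels):
        print([round(w, 3) for w in mask])
```

Because each seed is drawn at random from the still-unexplained pixels, the order of the resulting masks carries no meaning, and the loop can terminate before `max_clusters` is reached.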
ID: oxford-uuid:358bad73-d7a4-4b00-a975-5609a57ffc14
Institution: University of Oxford