Stacked capsule autoencoders

Objects are composed of a set of geometrically organized parts. We introduce an unsupervised capsule autoencoder (SCAE), which explicitly uses geometric relationships between parts to reason about objects. Since these relationships do not depend on the viewpoint, our model is robust to viewpoint cha...


Bibliographic Details
Main Authors: Kosiorek, AR; Sabour, S; Teh, YW; Hinton, GE; Miss Jo STAFFORD-TOLLEY
Format: Conference item
Language: English
Published: Neural Information Processing Systems 2019
_version_ 1797083042607529984
author Kosiorek, AR
Sabour, S
Teh, YW
Hinton, GE
Miss Jo STAFFORD-TOLLEY
author_sort Kosiorek, AR
collection OXFORD
description Objects are composed of a set of geometrically organized parts. We introduce an unsupervised capsule autoencoder (SCAE), which explicitly uses geometric relationships between parts to reason about objects. Since these relationships do not depend on the viewpoint, our model is robust to viewpoint changes. SCAE consists of two stages. In the first stage, the model predicts presences and poses of part templates directly from the image and tries to reconstruct the image by appropriately arranging the templates. In the second stage, the SCAE predicts parameters of a few object capsules, which are then used to reconstruct part poses. Inference in this model is amortized and performed by off-the-shelf neural encoders, unlike in previous capsule networks. We find that object capsule presences are highly informative of the object class, which leads to state-of-the-art results for unsupervised classification on SVHN (55%) and MNIST (98.7%).
first_indexed 2024-03-07T01:36:23Z
format Conference item
id oxford-uuid:95564c2c-5afb-46f8-b509-a60da6a15375
institution University of Oxford
language English
last_indexed 2024-03-07T01:36:23Z
publishDate 2019
publisher Neural Information Processing Systems
record_format dspace