On Learning and Learned Data Representation by Capsule Networks


Bibliographic Details
Main Authors: Ancheng Lin, Jun Li, Zhenyuan Ma
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8692359/
Description
Summary: Capsule networks (CapsNet) are recently proposed neural network models that contain a newly introduced processing layer and are specialized in entity representation and discovery in images. CapsNet is motivated by a parse-tree-like view of information processing and employs an iterative routing operation that dynamically determines connections between layers composed of capsule units. Information ascends through different levels of interpretation, from raw sensory observation to semantically meaningful entities represented by active capsules. The CapsNet architecture is plausible and has proven effective in some image data processing tasks. The newly introduced routing operation is mainly required for determining the capsules' activation status during the forward pass; however, its influence on model fitting and the resulting representation is barely understood. In this work, we investigate the following: 1) how the routing affects CapsNet model fitting; 2) how the representation using capsules helps discover global structures in the data distribution; and 3) how the learned data representation adapts and generalizes to new tasks. Our investigation yielded the following results, some of which were mentioned in the original CapsNet paper: 1) the routing operation determines the certainty with which a layer of capsules passes information to the layer above, and the appropriate level of certainty is related to the model fitness; 2) in a designed experiment using data with a known 2D structure, capsule representations enable a more meaningful 2D manifold embedding than neurons do in a standard convolutional neural network (CNN); and 3) compared with neurons of the standard CNN, capsules of successive layers are less coupled and more adaptive to new data distributions.
ISSN: 2169-3536
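
Illustration: the summary states that the iterative routing operation determines the certainty with which one layer of capsules passes information to the layer above. This record does not give the routing details, so the following is only a minimal sketch, assuming the routing-by-agreement procedure commonly described for CapsNet (softmax coupling coefficients, a squashing nonlinearity, and dot-product agreement updates). The function names (`squash`, `softmax`, `route`) and the toy prediction vectors in `U_HAT` are hypothetical and chosen for illustration; they are not taken from the article.

```python
import math


def squash(s):
    """Scale vector s to a length in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(u_hat, num_iters):
    """Assumed routing-by-agreement between two capsule layers.

    u_hat[i][j] is the prediction vector that lower capsule i makes
    for upper capsule j. Returns (v, c): the upper capsule outputs and
    the coupling coefficients used to compute them.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits start at zero
    for it in range(num_iters):
        # Each lower capsule distributes its output over the upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        if it < num_iters - 1:
            # Raise the logit where a prediction agrees with the output.
            for i in range(n_in):
                for j in range(n_out):
                    b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, c


# Toy data (made up): three lower capsules, two upper capsules, 2D vectors.
# All predictions for upper capsule 0 agree; those for capsule 1 conflict.
U_HAT = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
```

With a single iteration the coupling coefficients stay uniform, so every lower capsule splits its output evenly. With more iterations the coefficients concentrate on the upper capsule whose predictions agree, which is one way to read the "certainty" mentioned in the summary. Comparing `route(U_HAT, 1)` with `route(U_HAT, 3)` shows this on the toy data.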