2019
DOI: 10.48550/arxiv.1906.06818
Preprint
Stacked Capsule Autoencoders

Abstract: An object can be seen as a geometrically organized set of interrelated parts. A system that makes explicit use of these geometric relationships to recognize objects should be naturally robust to changes in viewpoint, because the intrinsic geometric relationships are viewpoint-invariant. We describe an unsupervised version of capsule networks, in which a neural encoder, which looks at all of the parts, is used to infer the presence and poses of object capsules. The encoder is trained by backpropagating through …
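The abstract describes an encoder that looks at all of a scene's parts at once and infers the presence and pose of each object capsule. A minimal sketch of that idea is a permutation-invariant encoder: per-part features are pooled symmetrically, so the inferred presence and pose do not depend on the order in which parts are presented. The weights, dimensions, and pooling choice below are illustrative assumptions, not the paper's architecture (the paper uses a Set Transformer encoder trained by backpropagation through a decoder).

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up dimensions for illustration: 5 parts, 6-D part poses,
# 16-D pooled features. None of these come from the paper.
n_parts, pose_dim, feat_dim = 5, 6, 16
W_pool = rng.normal(size=(pose_dim, feat_dim)) * 0.1
w_pres = rng.normal(size=feat_dim)
w_pose = rng.normal(size=(feat_dim, pose_dim)) * 0.1

def encode_object(part_poses):
    """Pool part features symmetrically, then read out one object
    capsule's presence probability and pose estimate."""
    feats = np.tanh(part_poses @ W_pool)                # per-part features
    pooled = feats.mean(axis=0)                         # order-independent pooling
    presence = 1.0 / (1.0 + np.exp(-pooled @ w_pres))   # sigmoid, in (0, 1)
    pose = pooled @ w_pose                              # object pose estimate
    return presence, pose

parts = rng.normal(size=(n_parts, pose_dim))  # random 6-D poses of 5 parts
p1, pose1 = encode_object(parts)
p2, pose2 = encode_object(parts[::-1])        # same parts, reversed order
```

Because mean pooling is symmetric in its inputs, reordering the parts leaves the inferred presence and pose unchanged, which is the property a set-valued encoder over parts needs.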

Cited by 8 publications (7 citation statements) · References 17 publications
“…To improve biological plausibility, all computations in our model are local and all units are connected to the same small, local set of other units throughout learning and inference, which matches early visual cortex, in which the lateral connections that follow natural image statistics are implemented anatomically [4,26,48,59]. This is in contrast to other ideas that require flexible pointers to arbitrary locations and features [as discussed by 53] or capsules that flexibly encode different parts of the input [9,36,50,51]. Nonetheless, we employ contrastive learning objectives and backpropagation here, for which we do not provide a biologically plausible implementation.…”
Section: Discussion
confidence: 90%
“…A small body of work focuses on developing equivariant autoencoders. Several methods construct data- and group-specific architectures to auto-encode data equivariantly, learning an equivariant representation in the process (Hinton et al, 2011; Kosiorek et al, 2019). Others use supervision to extract class-invariant and class-equivariant representations (Feige, 2022).…”
Section: Equivariant Representations Of Atomic Systems
confidence: 99%
“…The Capsule Network was first proposed in [22] and improved in [7] and [12]; it is designed for image feature extraction. In general, a Capsule Network can not only effectively fuse information from numerous elements into highly expressive representations without information loss, but can also reveal the contributions of different elements to those representations through its routing mechanism.…”
Section: Capsule Network
confidence: 99%
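The routing mechanism mentioned in the statement above refers to routing-by-agreement from Sabour et al. [22]: each lower-level capsule's prediction vector is coupled more strongly to the output capsules it agrees with, and the coupling coefficients are refined over a few iterations. The sketch below is a plain-NumPy rendition of that loop; the shapes and iteration count are illustrative assumptions.

```python
import numpy as np

def squash(s, eps=1e-9):
    """Shrink vector lengths into [0, 1) while preserving direction."""
    n2 = (s ** 2).sum(axis=-1, keepdims=True)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def dynamic_routing(u_hat, n_iters=3):
    """Routing-by-agreement in the spirit of Sabour et al. [22].
    u_hat: (n_in, n_out, d) prediction vectors from input capsules.
    Returns output capsule vectors (n_out, d) and coupling coefficients."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                 # routing logits
    for _ in range(n_iters):
        # softmax over output capsules: each input's couplings sum to 1
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)  # weighted vote per output
        v = squash(s)                           # output capsule vectors
        b = b + (u_hat * v[None]).sum(axis=-1)  # reward agreeing predictions
    return v, c

rng = np.random.default_rng(1)
u_hat = rng.normal(size=(8, 3, 4))  # 8 input capsules, 3 outputs, 4-D poses
v, c = dynamic_routing(u_hat)
```

The dot product between a prediction and the current output vector raises the routing logit for inputs whose votes agree, which is how the coupling coefficients come to "reveal the contributions of different elements".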