2017 IEEE International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv.2017.628

Representation Learning by Learning to Count

Abstract: We introduce a novel method for representation learning that uses an artificial supervision signal based on counting visual primitives. This supervision signal is obtained from an equivariance relation, which does not require any manual annotation. We relate transformations of images to transformations of the representations. More specifically, we look for the representation that satisfies such a relation rather than the transformations that match a given representation. In this paper, we use two image transform…
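The counting constraint in the abstract can be made concrete: the number of visual primitives should be invariant to downsampling, and the count over a whole image should equal the sum of the counts over its tiles. Below is a minimal sketch of such a loss, assuming a PyTorch encoder `f` mapping images to non-negative "count" vectors; the function name, margin value, and bilinear downsampling are illustrative choices, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def counting_loss(f, x, y, margin=10.0):
    """Counting pretext loss (sketch).

    f : encoder mapping a batch of images to non-negative count vectors.
    x : batch of images (B, C, H, W), with H and W even.
    y : batch of different images, used to avoid the trivial f == 0 solution.
    """
    B, C, H, W = x.shape
    # Downsample x by 2: the count should be invariant to scale.
    x_small = F.interpolate(x, scale_factor=0.5, mode="bilinear",
                            align_corners=False)
    # Split x into four non-overlapping tiles of size (H/2, W/2).
    tiles = [x[:, :, i*H//2:(i+1)*H//2, j*W//2:(j+1)*W//2]
             for i in range(2) for j in range(2)]
    tile_count_sum = sum(f(t) for t in tiles)
    # Equivariance: count of the downsampled image == sum of tile counts.
    d_same = ((f(x_small) - tile_count_sum) ** 2).sum(dim=1)
    # Contrastive term on a different image y rules out constant features.
    y_small = F.interpolate(y, scale_factor=0.5, mode="bilinear",
                            align_corners=False)
    d_diff = ((f(y_small) - tile_count_sum) ** 2).sum(dim=1)
    return (d_same + F.relu(margin - d_diff)).mean()
```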

Cited by 362 publications (262 citation statements) | References 39 publications
“…Unsupervised learning through learning a representation space that successfully reconstructs samples is widely used for a variety of tasks, including classification [26], denoising [57], and in-painting [43]. Conventional methods for unsupervised representation learning are usually based on a pretext task such as reconstruction of static images [38] or videos [59]. Learning to reconstruct data was used for tasks like denoising [57], in-painting [42], image refinement for defense against adversarial examples [53], and for one-class classifiers [46,51,52].…”
Section: Related Work
confidence: 99%
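As a reference point for the reconstruction-based pretext tasks surveyed in this citation, here is a toy denoising autoencoder in PyTorch; the architecture, input size, and noise level are illustrative assumptions, not taken from any of the cited works.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """Toy denoising autoencoder: reconstruct a clean input from a
    corrupted copy; the encoder output is the learned representation."""
    def __init__(self, dim=784, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x, noise_std=0.3):
        x_noisy = x + noise_std * torch.randn_like(x)  # corrupt the input
        return self.decoder(self.encoder(x_noisy))

model = DenoisingAutoencoder()
x = torch.rand(32, 784)                      # dummy batch of flattened images
loss = nn.functional.mse_loss(model(x), x)   # reconstruct the *clean* input
```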
“…Other types of pretext tasks proposed for unsupervised learning include understanding the correct order of video frames [6,36] or predicting the spatial relation between image patches [12]; e.g., jigsaw puzzle solving as a pretext task was exploited by Noroozi and Favaro [37]. In another work, Noroozi et al. [38] proposed to train an unsupervised model by counting the primitive elements of images. Pathak et al. [41] proposed a model to segment an image into foreground and background.…”
Section: Related Work
confidence: 99%
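The patch-based pretext tasks cited above share one recipe: derive a label for free from image layout. A minimal sketch of relative patch position sampling in the spirit of [12] follows; the patch size, gap, and 8-way neighbour encoding are assumptions for illustration.

```python
import random
import torch

def sample_patch_pair(img, patch=96, gap=16):
    """Sample a center patch and one of its 8 neighbours from an image
    tensor (C, H, W); the relative position index is the pretext label.
    Assumes the image is large enough: H, W >= 3*patch + 2*gap."""
    C, H, W = img.shape
    step = patch + gap
    cy = random.randint(step, H - step - patch)
    cx = random.randint(step, W - step - patch)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    label = random.randrange(8)
    dy, dx = offsets[label]
    center = img[:, cy:cy+patch, cx:cx+patch]
    neighbor = img[:, cy+dy*step:cy+dy*step+patch,
                   cx+dx*step:cx+dx*step+patch]
    return center, neighbor, label  # a classifier predicts `label`
```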
“…The empirical analysis of [33,20] suggests that the performance of recent deep networks is not yet saturated with respect to the size of training data. For this reason, learning methods ranging from semi-supervised [42,39,33,20] to unsupervised [1,7,58,38] are attracting attention, along with weakly-labeled or unlabeled large-scale data.…”
Section: Introduction
confidence: 99%
“…A very common solution is that of measuring the source prediction uncertainty on the target data with an entropy loss which is minimized during training [16,18]. A recent stream of works has introduced techniques to extract self-supervisory signals from unlabeled data, such as the patch relative position [9,22], counting primitives [23], or image coloring [33]. These signals capture invariances and regularities that allow training models useful as fine-tuning priors, and this information also appears independent of the specific visual domain of the data from which it is obtained.…”
Section: Related Work
confidence: 99%
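The entropy loss mentioned in this citation has a standard form, H(p) = -Σ_c p_c log p_c, averaged over unlabeled target samples. A minimal PyTorch sketch follows; the weighting of the loss against the supervised term is an assumption left to the user.

```python
import torch
import torch.nn.functional as F

def entropy_loss(logits):
    """Mean prediction entropy over a batch of unlabeled target logits;
    minimizing it sharpens the source model's predictions on target data."""
    log_p = F.log_softmax(logits, dim=1)
    p = log_p.exp()
    return -(p * log_p).sum(dim=1).mean()

# Usage: added to the supervised source loss during adaptation, e.g.
# total = ce_loss + lam * entropy_loss(model(target_batch))
```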