2017
DOI: 10.48550/arxiv.1704.06888
Time-Contrastive Networks: Self-Supervised Learning from Video

Cited by 40 publications (70 citation statements)
References 0 publications
“…A third class of techniques that is able to perform imitation learning without requiring knowledge of actions includes those that first focus on learning a representation of the task and then use an RL method with a predefined surrogate reward over that representation. For example, Gupta et al (2017) have proposed an invariant feature space to transfer skills between agents with different embodiments, Liu et al (2017) have presented a network architecture which is capable of handling differences in viewpoints and contexts between the imitator and the demonstrator, and Sermanet et al (2017) have proposed a time-contrastive network which is invariant to both different embodiments and viewpoints. While these techniques represent significant advances in representation learning, each of them uses the same surrogate reward function, i.e., the proximity of the imitator's and demonstrator's encoded representation at each time step.…”
Section: Related Work
confidence: 99%
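For illustration, a minimal sketch of the shared surrogate reward described in the statement above, assuming a learned encoder that maps observations to an embedding space; the function name and the choice of L2 distance are illustrative, not taken from any of the cited papers.

```python
import numpy as np

def surrogate_reward(encoder, imitator_obs, demo_obs):
    # Encode both observations with the learned task representation
    # (the encoder itself would come from one of the methods above).
    z_imitator = encoder(imitator_obs)
    z_demo = encoder(demo_obs)
    # Reward at this time step is the (negative) distance between the
    # imitator's and the demonstrator's encoded representations.
    return -float(np.linalg.norm(z_imitator - z_demo))
```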
“…• Time Contrastive Networks (TCN) (Sermanet et al, 2017): TCNs use a triplet loss to train a neural network to learn an encoded form of the task at each time step. This loss function brings the states that occur in a small time-window closer together in the embedding space and pushes the ones from distant time-steps far apart.…”
Section: Experimental Setup and Implementation Details
confidence: 99%
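For illustration, a minimal sketch in PyTorch of the time-contrastive triplet loss described in the statement above: frames within a small window of the anchor are treated as positives, frames far away in time as negatives. The window sizes and margin are illustrative defaults, not the values used by Sermanet et al. (2017).

```python
import torch
import torch.nn.functional as F

def time_contrastive_triplet_loss(embeddings, pos_window=2, neg_gap=10, margin=1.0):
    """embeddings: (T, D) per-timestep embeddings of one video.
    Anchors are pulled toward frames within `pos_window` steps and
    pushed away from frames at least `neg_gap` steps away."""
    T = embeddings.shape[0]
    losses = []
    for t in range(T):
        pos_idx = [i for i in range(max(0, t - pos_window), min(T, t + pos_window + 1)) if i != t]
        neg_idx = [i for i in range(T) if abs(i - t) >= neg_gap]
        if not pos_idx or not neg_idx:
            continue
        p = pos_idx[torch.randint(len(pos_idx), (1,)).item()]
        n = neg_idx[torch.randint(len(neg_idx), (1,)).item()]
        d_pos = F.pairwise_distance(embeddings[t:t + 1], embeddings[p:p + 1])
        d_neg = F.pairwise_distance(embeddings[t:t + 1], embeddings[n:n + 1])
        # Standard triplet hinge: positive closer than negative by a margin.
        losses.append(F.relu(d_pos - d_neg + margin))
    return torch.cat(losses).mean()
```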
“…Self-supervised learning from sequences. Previous work in contrastive learning for sequential data often leverages a slowness assumption, using nearby samples as positive examples and distant samples as negative examples (Oord et al., 2018; Sermanet et al., 2018; Dwibedi et al., 2019; Le-Khac et al., 2020; Banville et al., 2020). Contrastive predictive coding (CPC) (Oord et al., 2018) builds upon the idea of temporal contrastive learning by training an autoregressive model that predicts future points given previously observed timesteps.…”
Section: Related Work
confidence: 99%
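For illustration, a minimal sketch of the CPC idea summarized in the statement above: an autoregressive network summarizes past per-step embeddings and predicts the embedding k steps ahead, scored with an InfoNCE-style loss. The GRU context network, linear prediction head, and use of other time steps in the same sequence as negatives are illustrative simplifications rather than the exact design of Oord et al. (2018).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleCPC(nn.Module):
    """Predict embeddings k steps ahead from an autoregressive context and
    score them against other time steps with an InfoNCE-style loss."""
    def __init__(self, emb_dim=64, ctx_dim=128, k=4):
        super().__init__()
        self.ar = nn.GRU(emb_dim, ctx_dim, batch_first=True)
        self.predict = nn.Linear(ctx_dim, emb_dim)
        self.k = k

    def forward(self, z):                       # z: (B, T, emb_dim) per-step embeddings
        ctx, _ = self.ar(z)                     # (B, T, ctx_dim) autoregressive context
        preds = self.predict(ctx[:, :-self.k])  # predict z_{t+k} from context at t
        targets = z[:, self.k:]                 # (B, T-k, emb_dim) true future embeddings
        # InfoNCE over time: the matching step is the positive, all other
        # steps in the same sequence act as negatives.
        logits = torch.einsum('btd,bsd->bts', preds, targets)  # (B, T-k, T-k)
        labels = torch.arange(logits.shape[1], device=z.device).expand(logits.shape[0], -1)
        return F.cross_entropy(logits.reshape(-1, logits.shape[-1]), labels.reshape(-1))
```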
“…Temporal shift: As in previous work in temporal contrastive learning (Oord et al., 2018; Sermanet et al., 2018; Dwibedi et al., 2019; Le-Khac et al., 2020; Banville et al., 2020), we can use nearby samples as positive examples for one another.…”
Section: B3 Augmentations For Neural Data
confidence: 99%
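For illustration, a minimal sketch of the temporal-shift positive sampling described in the statement above: each sample is paired with a randomly chosen neighbour within a small window, which can then serve as its positive example in a contrastive objective. The window size is an illustrative parameter, not a value from the cited works.

```python
import numpy as np

def temporal_shift_pairs(sequence, max_shift=5, rng=None):
    """For each time step t, return (anchor, positive) where the positive is
    drawn uniformly from within `max_shift` steps of t; samples outside the
    window can serve as negatives downstream."""
    rng = rng or np.random.default_rng()
    T = len(sequence)
    pairs = []
    for t in range(T):
        lo, hi = max(0, t - max_shift), min(T - 1, t + max_shift)
        candidates = [i for i in range(lo, hi + 1) if i != t]
        pairs.append((sequence[t], sequence[rng.choice(candidates)]))
    return pairs
```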