2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr46437.2021.00661
Unsupervised Visual Attention and Invariance for Reinforcement Learning

Cited by 21 publications (6 citation statements) | References 18 publications
“…Some works leverage inductive biases, for example, object-oriented architectures (Kansky et al 2017; Yi et al 2022) and disentangled representations (Higgins et al 2017; Peng et al 2022). Other methods aim to eliminate misleading information such as temporal information (Raileanu et al 2021; Guo et al 2021) or the background (Wang et al 2021). Still others define an invariance metric for optimizing the representation, such as bisimulation metrics (Zhang et al 2021) and policy similarity (Agarwal et al 2021).…”
Section: Related Work
confidence: 99%
“…Zhang et al (2020) use bisimulation metrics to quantify behavioral similarity between states and learn robust task-relevant representations. Wang et al (2021b) extract, without supervision, the visual foreground to provide background invariant inputs to the policy learner.…”
Section: Related Work
confidence: 99%
“…DVRL [35], PlaNet [27], and SLAC [44] predict future observations and rewards given the current observation and action. Transporter [41] and VAI [77] train an unsupervised keypoint detector to discover critical objects in images for control. After the agent is deployed, SSL can be used to continuously improve the policy [28].…”
Section: Related Work
confidence: 99%