Robotics: Science and Systems XVI 2020
DOI: 10.15607/rss.2020.xvi.101
Learning Task-Driven Control Policies via Information Bottlenecks

Abstract: This paper presents a reinforcement learning approach to synthesizing task-driven control policies for robotic systems equipped with rich sensory modalities (e.g., vision or depth). Standard reinforcement learning algorithms typically produce policies that tightly couple control actions to the entirety of the system's state and rich sensor observations. As a consequence, the resulting policies can often be sensitive to changes in task-irrelevant portions of the state or observations (e.g., changing background c…
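To make the abstract's idea concrete, the sketch below shows one common way an information-bottleneck term can be attached to a policy objective: a variational KL penalty that discourages the latent code from carrying task-irrelevant bits. This is a generic, hypothetical illustration of the bottleneck principle, not the paper's actual algorithm; the function names, the Gaussian encoder parameterization, and the `beta` weight are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def vib_penalty(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ): the standard variational
    information-bottleneck regularizer on a Gaussian latent code z."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def ib_policy_loss(task_loss, mu, log_var, beta=1e-3):
    # Total objective: task performance plus a bottleneck penalty that
    # limits how much information the code retains about the raw input.
    return task_loss + beta * vib_penalty(mu, log_var)

# Toy usage: a 4-dim latent code produced by a (hypothetical) encoder.
mu = rng.normal(size=4) * 0.1
log_var = np.full(4, -2.0)
loss = ib_policy_loss(task_loss=1.25, mu=mu, log_var=log_var)
```

The penalty is zero only when the code's distribution matches the uninformative prior, so minimizing the combined loss trades task performance against how much the policy's representation can depend on the raw observation.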

Cited by 9 publications (7 citation statements)
References 19 publications
“…The first example we consider is the lava problem (Figure 2) (Cassandra et al., 1994; Florence, 2017; Pacelli and Majumdar, 2020) from the POMDP literature.…”
Section: Lava Problem
confidence: 99%
“…The first example we consider is the lava problem (Figure 2) (Cassandra et al., 1994; Florence, 2017; Pacelli and Majumdar, 2020) from the POMDP literature.
Figure 2. An illustration of the lava problem. The robot needs to navigate to a goal without falling into the lava (using a noisy sensor).
…”
Section: Examples
confidence: 99%
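The lava problem quoted above is a small POMDP benchmark, and a minimal sketch makes its structure clear: a line of cells with lava at one end, a goal, and a position sensor that is only correct with some probability. The grid size, sensor noise level, and reward values below are illustrative assumptions, not the parameters used in the cited papers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-D lava problem: cells 0..4, lava at cell 0, goal at cell 3.
LAVA, GOAL, N_CELLS = 0, 3, 5
P_CORRECT = 0.8  # the sensor reports the true cell with this probability

def sense(state):
    """Noisy position sensor: correct with prob P_CORRECT, else uniform."""
    if rng.random() < P_CORRECT:
        return state
    return int(rng.integers(N_CELLS))

def step(state, action):
    """action in {-1, +1}; episode ends in lava (reward -1) or goal (+1)."""
    nxt = int(np.clip(state + action, 0, N_CELLS - 1))
    if nxt == LAVA:
        return nxt, -1.0, True
    if nxt == GOAL:
        return nxt, +1.0, True
    return nxt, 0.0, False

# A cautious policy: always move right, away from the lava.
state, done, total = 2, False, 0.0
while not done:
    obs = sense(state)  # the policy only ever sees the noisy observation
    state, r, done = step(state, +1)
    total += r
```

The difficulty the POMDP literature highlights is that the noisy observation makes aggressive strategies risky: a policy that trusts a single sensor reading near the lava can be pushed into it, which is why belief-space or information-aware controllers are studied on this example.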
“…Literature Review: Recent work has applied information bottleneck theory [32] to build controllers that focus on actionable, task-relevant visual inputs for robust, generalizable navigation and grasping policies [27,26,29]. In contrast, we introduce a novel algorithm for co-designing communication and machine perception, which uses pre-trained task modules to learn salient, efficiently-computable representations.…”
Section: Train-time Only
confidence: 99%
“…To capture intricate non-linear embeddings, we turn to recent works that learn latent representations from data [27]. Robots can learn low-dimensional models of states [40], dynamics [49,50], movement primitives [39], trajectories [14], plans [32], policies [17], skills [42], and action representations for reinforcement learning [11].…”
Section: Related Work
confidence: 99%