Robotics: Science and Systems XVI 2020
DOI: 10.15607/rss.2020.xvi.101
Learning Task-Driven Control Policies via Information Bottlenecks

Abstract: This paper presents a reinforcement learning approach to synthesizing task-driven control policies for robotic systems equipped with rich sensory modalities (e.g., vision or depth). Standard reinforcement learning algorithms typically produce policies that tightly couple control actions to the entirety of the system's state and rich sensor observations. As a consequence, the resulting policies can often be sensitive to changes in task-irrelevant portions of the state or observations (e.g., changing background c…
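To make the abstract's idea concrete, the sketch below shows one common way an information-bottleneck term can be attached to a policy objective: a variational KL penalty that discourages the latent code from carrying task-irrelevant bits. This is a generic, hypothetical illustration of the bottleneck principle, not the paper's actual algorithm; the function names, the Gaussian encoder parameterization, and the `beta` weight are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def vib_penalty(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ): the standard variational
    information-bottleneck regularizer on a Gaussian latent code z."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def ib_policy_loss(task_loss, mu, log_var, beta=1e-3):
    # Total objective: task performance plus a bottleneck penalty that
    # limits how much information the code retains about the raw input.
    return task_loss + beta * vib_penalty(mu, log_var)

# Toy usage: a 4-dim latent code produced by a (hypothetical) encoder.
mu = rng.normal(size=4) * 0.1
log_var = np.full(4, -2.0)
loss = ib_policy_loss(task_loss=1.25, mu=mu, log_var=log_var)
```

The penalty is zero only when the code's distribution matches the uninformative prior, so minimizing the combined loss trades task performance against how much the policy's representation can depend on the raw observation.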

Cited by 9 publications (7 citation statements)
References 19 publications
“…The first example we consider is the lava problem (Figure 2) (Cassandra et al., 1994; Florence, 2017; Pacelli and Majumdar, 2020) from the POMDP literature.…”
Section: Lava Problem
confidence: 99%
“…The first example we consider is the lava problem (Figure 2) (Cassandra et al., 1994; Florence, 2017; Pacelli and Majumdar, 2020) from the POMDP literature.
Figure 2. An illustration of the lava problem. The robot needs to navigate to a goal without falling into the lava (using a noisy sensor).
…”
Section: Examples
confidence: 99%
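The lava problem quoted above is a small POMDP benchmark, and a minimal sketch makes its structure clear: a line of cells with lava at one end, a goal, and a position sensor that is only correct with some probability. The grid size, sensor noise level, and reward values below are illustrative assumptions, not the parameters used in the cited papers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-D lava problem: cells 0..4, lava at cell 0, goal at cell 3.
LAVA, GOAL, N_CELLS = 0, 3, 5
P_CORRECT = 0.8  # the sensor reports the true cell with this probability

def sense(state):
    """Noisy position sensor: correct with prob P_CORRECT, else uniform."""
    if rng.random() < P_CORRECT:
        return state
    return int(rng.integers(N_CELLS))

def step(state, action):
    """action in {-1, +1}; episode ends in lava (reward -1) or goal (+1)."""
    nxt = int(np.clip(state + action, 0, N_CELLS - 1))
    if nxt == LAVA:
        return nxt, -1.0, True
    if nxt == GOAL:
        return nxt, +1.0, True
    return nxt, 0.0, False

# A cautious policy: always move right, away from the lava.
state, done, total = 2, False, 0.0
while not done:
    obs = sense(state)  # the policy only ever sees the noisy observation
    state, r, done = step(state, +1)
    total += r
```

The difficulty the POMDP literature highlights is that the noisy observation makes aggressive strategies risky: a policy that trusts a single sensor reading near the lava can be pushed into it, which is why belief-space or information-aware controllers are studied on this example.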
“…Literature Review: Recent work has applied information bottleneck theory [32] to build controllers that focus on actionable, task-relevant visual inputs for robust, generalizable navigation and grasping policies [27,26,29]. In contrast, we introduce a novel algorithm for co-designing communication and machine perception, which uses pre-trained task modules to learn salient, efficiently-computable representations.…”
Section: Train-time Only
confidence: 99%
“…To capture intricate non-linear embeddings, we turn to recent works that learn latent representations from data [27]. Robots can learn low-dimensional models of states [40], dynamics [49,50], movement primitives [39], trajectories [14], plans [32], policies [17], skills [42], and action representations for reinforcement learning [11].…”
Section: Related Work
confidence: 99%