2022 IEEE 61st Conference on Decision and Control (CDC) 2022
DOI: 10.1109/cdc51059.2022.9992959

Inverse-Inverse Reinforcement Learning. How to Hide Strategy from an Adversarial Inverse Reinforcement Learner

Cited by 7 publications (1 citation statement)
References: 21 publications
“…The Rational Communicative Social Actions (RCSA) framework integrates inverse planning and RSA to model communicative aspects of intimacy (Hung, Thomas, Radkani, Tenenbaum, & Saxe, 2022) and punishment (Radkani, Tenenbaum, & Saxe, 2022). Similarly, in the reinforcement learning community, “inverse reinforcement learning” methods (Arora & Doshi, 2021; Ng, Russell, & others, 2000; Ramachandran & Amir, 2007) can infer reward functions, which can then be applied to influence observers, for example, to make a robot's motion “legible” to humans (Dragan, 2015; Dragan, Lee, & Srinivasa, 2013; Hadfield‐Menell, Russell, Abbeel, & Dragan, 2016) or oppositely to strategically fool adversarial viewers about its true intentions (Pattanayak, Krishnamurthy, & Berry, 2022).…”
Section: Background: Bayesian Inverse Planning (mentioning)
confidence: 99%