Efficient Probabilistic Performance Bounds for Inverse Reinforcement Learning

Brown, Daniel S.; Niekum, Scott

doi:10.1609/aaai.v32i1.11755

Cited by 20 publications

(6 citation statements)

References 27 publications


“…Since LfD utilizes expert demonstrations, the robot can be better incentivized to stay within safe or relevant regions of the state space, especially when compared with techniques that require significant exploration, such as reinforcement learning. This is because demonstrations provide a way to assess the safety or risk associated with regions of the state space (e.g., [196][197][198]). Furthermore, several LfD methods provide and utilize measures of uncertainty associated with different parts of the state space (e.g., 62, 81, 100), enabling communication of the system's confidence to the user.…”

Section: Safe Learning (mentioning)

confidence: 99%

Recent Advances in Robot Learning from Demonstration

Ravichandar, Polydoros, Chernova, et al. 2020

Annu. Rev. Control Robot. Auton. Syst.


In the context of robotics and automation, learning from demonstration (LfD) is the paradigm in which robots acquire new skills by learning to imitate an expert. The choice of LfD over other robot learning methods is compelling when ideal behavior can be neither easily scripted (as is done in traditional robot programming) nor easily defined as an optimization problem, but can be demonstrated. While there have been multiple surveys of this field in the past, there is a need for a new one given the considerable growth in the number of publications in recent years. This review aims to provide an overview of the collection of machine-learning methods used to enable a robot to learn from and imitate a teacher. We focus on recent advancements in the field and present an updated taxonomy and characterization of existing methods. We also discuss mature and emerging application areas for LfD and highlight the significant challenges that remain to be overcome both in theory and in practice.


“…In contrast, the ML-IRL is based on maximum likelihood estimation, which cannot incorporate prior knowledge and handle uncertainty. IRL with the Bayesian optimization method has been used to learn driving strategies [54], mobile robot navigation [55,56], and robot demonstrative learning [57] with good performance. Hierarchical BIRL extended on the original basis outperforms MaxEnt-IRL in cab driver route selection based on maps and GPS data [58].…”

Section: Bayesian Optimization Methods (mentioning)

confidence: 99%

Data-Driven Policy Learning Methods from Biological Behavior: A Systematic Review

Wang, Hayashibe, Owaki 2024

Applied Sciences


Policy learning enables agents to learn how to map states to actions, thus enabling adaptive and flexible behavioral generation in complex environments. Policy learning methods are fundamental to reinforcement learning techniques. However, as problem complexity and the requirement for motion flexibility increase, traditional methods that rely on manual design have revealed their limitations. Conversely, data-driven policy learning focuses on extracting strategies from biological behavioral data and aims to replicate these behaviors in real-world environments. This approach enhances the adaptability of agents to dynamic substrates. Furthermore, this approach has been extensively applied in autonomous driving, robot control, and interpretation of biological behavior. In this review, we survey developments in data-driven policy-learning algorithms over the past decade. We categorized them into the following three types according to the purpose of the method: (1) imitation learning (IL), (2) inverse reinforcement learning (IRL), and (3) causal policy learning (CPL). We describe the classification principles, methodologies, progress, and applications of each category in detail. In addition, we discuss the distinct features and practical applications of these methods. Finally, we explore the challenges these methods face and prospective directions for future research.


“…Furthermore, Brown et al [54] construct a sampling-based Bayesian IRL model, which utilizes expert trajectories to calculate practical high-confidence upper bounds on the α-worst-case difference in expected return under the unseen scenarios without a reward function. Palan et al [55] propose DemPref model, which utilizes the expert trajectory to learn a coarse reward function, the trajectory is used to ground the (active) query generation process, to improve the quality of the generated queries.…”

Section: A Imitation Learning (mentioning)

confidence: 99%

Hierarchical Interpretable Imitation Learning for End-to-End Autonomous Driving

Teng, Chen, et al. 2023

IEEE Trans. Intell. Veh.


Thanks to the augmented convenience, safety advantages, and potential commercial value, intelligent vehicles (IVs) have attracted wide attention throughout the world. Although a few autonomous driving unicorns assert that IVs will be commercially deployable by 2025, their implementation is still restricted to small-scale validation due to various issues, among which precise computation of control commands or trajectories by planning methods remains a prerequisite for IVs. This paper aims to review state-of-the-art planning methods, including pipeline planning and end-to-end planning methods. In terms of pipeline methods, a survey of selecting algorithms is provided along with a discussion of the expansion and optimization mechanisms, whereas in end-to-end methods, the training approaches and verification scenarios of driving tasks are points of concern. Experimental platforms are reviewed to facilitate readers in selecting suitable training and validation methods. Finally, the current challenges and future directions are discussed. The side-by-side comparison presented in this survey not only helps to gain insights into the strengths and limitations of the reviewed methods but also assists with system-level design choices.
