Machine learning approaches often lack safety guarantees, which are a key requirement in many real-world tasks. This paper addresses this shortcoming by extending reinforcement learning with a safety layer that restricts the action space to the subspace of safe actions. We demonstrate the proposed approach for lane changing in autonomous driving. To distinguish safe actions from unsafe ones, we compare planned motions with the set of possible occupancies of traffic participants generated by set-based predictions. In situations where no safe action exists, a verified failsafe controller is executed. We used real-world highway traffic data to train and test the proposed approach. The evaluation results show that the proposed approach trains agents that do not cause collisions during training or deployment.
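To make the safety-layer idea concrete, the following minimal Python sketch keeps a candidate action only if the planned ego occupancy never intersects any set-based occupancy prediction of the surrounding traffic, and otherwise falls back to the verified failsafe controller. All names (candidate_actions, intersects, and so on) are illustrative assumptions, not the paper's implementation.

def is_safe(action, ego_occupancy, predicted_occupancies, horizon):
    """A candidate action is safe if the planned ego occupancy never
    intersects any possible obstacle occupancy over the horizon."""
    for t in range(horizon):
        ego_set = ego_occupancy(action, t)             # planned ego occupancy at step t
        for obstacle_set in predicted_occupancies[t]:  # set-based prediction of traffic
            if ego_set.intersects(obstacle_set):       # assumed geometric intersection test
                return False
    return True

def safety_layer(agent, state, ego_occupancy, predicted_occupancies,
                 horizon, failsafe_controller):
    """Restrict the RL agent to the subspace of safe actions."""
    safe = [a for a in agent.candidate_actions(state)
            if is_safe(a, ego_occupancy, predicted_occupancies, horizon)]
    if safe:
        return agent.best_action(state, safe)  # policy chooses among verified-safe actions
    return failsafe_controller(state)          # no safe action exists: execute failsafe maneuver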
While reinforcement learning produces very promising results for many applications, its main disadvantage is the lack of safety guarantees, which prevents its use in safety-critical systems. In this work, we address this issue with a safety shield for nonlinear continuous systems that solve reach-avoid tasks. Our safety shield prevents a reinforcement learning agent from applying potentially unsafe actions by projecting each proposed action to the closest safe action. This approach, called action projection, is implemented via mixed-integer optimization. The safety constraints for action projection are obtained by applying parameterized reachability analysis using polynomial zonotopes, which makes it possible to accurately capture the nonlinear effects of the actions on the system. In contrast to other state-of-the-art approaches for action projection, our safety shield efficiently handles input constraints and dynamic obstacles, eases the incorporation of the spatial robot dimensions into the safety constraints, guarantees robust safety despite process noise and measurement errors, and is well suited for high-dimensional systems, as we demonstrate on several challenging benchmark systems.
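The core of action projection can be illustrated with a simplified continuous relaxation: project the agent's proposed action onto a safe set, here described by linear constraints, while staying as close as possible to the proposal. The paper's actual shield uses mixed-integer optimization with constraints from polynomial-zonotope reachability, which this hedged sketch does not reproduce.

import numpy as np
from scipy.optimize import minimize

def project_action(a_rl, A, b, a_min, a_max):
    """Closest safe action to a_rl in {a : A @ a <= b, a_min <= a <= a_max}."""
    objective = lambda a: float(np.sum((a - a_rl) ** 2))   # stay close to the proposal
    safety = {"type": "ineq", "fun": lambda a: b - A @ a}  # A @ a <= b (safety constraints)
    bounds = list(zip(a_min, a_max))                       # input (actuator) limits
    res = minimize(objective, x0=np.asarray(a_rl, dtype=float),
                   bounds=bounds, constraints=[safety])
    return res.x

# Example: the proposed action violates the constraint a0 + a1 <= 1 and is projected back.
a_safe = project_action([0.9, 0.8], A=np.array([[1.0, 1.0]]), b=np.array([1.0]),
                        a_min=[-1.0, -1.0], a_max=[1.0, 1.0])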
Autonomous vessels have to adhere to marine traffic rules to ensure traffic safety and reduce the liability of manufacturers. However, autonomous systems can only evaluate rule compliance if the rules are formulated in a precise, mathematical way. This paper formalizes marine traffic rules from the Convention on the International Regulations for Preventing Collisions at Sea (COLREGS) using temporal logic. In particular, the collision-prevention rules between two power-driven vessels are delineated. The formalization is based on modular predicates and adjustable parameters. We evaluate the formalized rules on real marine traffic data from three US coastal areas covering over 1,200 vessels.
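As a flavor of what such a formalization looks like, the sketch below encodes one modular predicate with an adjustable parameter and a simple "globally" temporal operator in Python; the predicate, its threshold, and the trace format are illustrative assumptions, not the paper's formalization.

import math

def head_on(own, other, angle_thresh=math.radians(170)):
    """Modular predicate with an adjustable parameter: two power-driven
    vessels approach on (nearly) reciprocal headings."""
    diff = abs(own["heading"] - other["heading"]) % (2 * math.pi)
    diff = min(diff, 2 * math.pi - diff)   # fold the heading difference into [0, pi]
    return diff >= angle_thresh            # nearly opposite headings

def globally(predicate, trace):
    """Temporal operator G: the predicate holds at every time step."""
    return all(predicate(own, other) for own, other in trace)

# Example trace of (own vessel, other vessel) states, headings in radians
trace = [({"heading": 0.0}, {"heading": math.pi}),
         ({"heading": 0.05}, {"heading": math.pi - 0.05})]
assert globally(head_on, trace)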
Reinforcement learning (RL) methods have gained popularity in the field of motion planning for autonomous vehicles due to their success in robotics and computer games. However, no existing work enables researchers to conveniently compare different underlying Markov decision processes (MDPs). To address this issue, we present CommonRoad-RL, an open-source toolbox to train and evaluate RL-based motion planners for autonomous vehicles. The configurability, modularity, and stability of CommonRoad-RL simplify the comparison of different MDPs. We demonstrate this by comparing agents trained with different rewards, action spaces, and vehicle models on a real-world highway dataset. Our toolbox is available at commonroad.in.tum.de.
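A typical interaction with the toolbox might look like the hypothetical sketch below; it assumes CommonRoad-RL registers a standard Gym environment, and the module path and environment id should be checked against the documentation at commonroad.in.tum.de rather than taken from this snippet.

import gym
import commonroad_rl.gym_commonroad  # noqa: F401 -- assumed import that registers the environment

env = gym.make("commonroad-v1")      # assumed environment id
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()          # stand-in for a trained RL policy
    obs, reward, done, info = env.step(action)  # classic Gym step signature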
Ensuring the safety of reinforcement learning (RL) algorithms is crucial for many real-world tasks. However, vanilla RL does not guarantee safety. In recent years, several methods have been proposed to provide safety guarantees for RL, yet, to the best of our knowledge, there is no comprehensive comparison of these provably safe RL methods. We therefore introduce a categorization of existing provably safe RL methods and present the theoretical foundations for both continuous and discrete action spaces. Additionally, we evaluate provably safe RL on an inverted pendulum; the experiments confirm that only provably safe RL methods indeed guarantee safety.
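One family of provably safe RL methods for discrete action spaces, action masking, can be sketched in a few lines; the safety check is assumed to be supplied by an external verification procedure, and all names are illustrative rather than taken from the paper.

def masked_action(q_values, actions, state, is_safe):
    """Action masking: the agent may only pick from actions verified safe in `state`."""
    safe_actions = [a for a in actions if is_safe(state, a)]
    assert safe_actions, "the verifier must guarantee at least one safe action"
    return max(safe_actions, key=lambda a: q_values[a])

For an inverted pendulum, is_safe could, for example, check that applying the action keeps the state inside a verified invariant region around the upright equilibrium.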