Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been growing research interest in studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely µ synthesis. We build a connection between robust adversarial RL and µ synthesis, and develop a model-free version of the well-known DK-iteration for solving state-feedback µ synthesis with static D-scaling. In the proposed algorithm, the K step mimics the classical central path algorithm by incorporating a recently developed double-loop adversarial RL method as a subroutine, and the D step is based on model-free finite-difference approximation. An extensive numerical study is also presented to demonstrate the utility of our proposed model-free algorithm. Our study sheds new light on the connections between adversarial RL and robust control.
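As a rough illustration only, the sketch below shows one way the alternating structure described above could be organized. The interfaces `cost_fn` (a black-box estimate of the scaled closed-loop cost) and `k_step` (the double-loop adversarial RL subroutine) are hypothetical placeholders, not the paper's actual API; only the finite-difference D step is spelled out.

```python
import numpy as np

def finite_diff_d_step(d, cost_fn, eps=1e-3, lr=1e-2, iters=50):
    """Model-free D step: update the static diagonal D-scaling parameters by
    finite-difference gradient descent on a black-box closed-loop cost.
    `cost_fn(d)` is an assumed oracle returning a scalar cost estimate."""
    for _ in range(iters):
        grad = np.zeros_like(d)
        for i in range(len(d)):
            d_plus, d_minus = d.copy(), d.copy()
            d_plus[i] += eps
            d_minus[i] -= eps
            # Central finite-difference estimate of the cost gradient.
            grad[i] = (cost_fn(d_plus) - cost_fn(d_minus)) / (2 * eps)
        d = d - lr * grad
    return d

def model_free_dk_iteration(init_d, k_step, cost_fn, outer_iters=10):
    """Alternate a K step (adversarial RL subroutine, placeholder here) with
    the finite-difference D step above."""
    d = init_d
    controller = None
    for _ in range(outer_iters):
        controller = k_step(d)  # K step: policy optimization against the worst-case adversary
        d = finite_diff_d_step(d, lambda dd: cost_fn(controller, dd))  # D step
    return controller, d
```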
The growing prospect of deep reinforcement learning (DRL) being used in cyber-physical systems has raised concerns about the safety and robustness of autonomous agents. Recent work on generating adversarial attacks has shown that it is computationally feasible for a bad actor to fool a DRL policy into behaving suboptimally. Although certain adversarial attacks with specific attack models have been addressed, most studies focus only on off-line optimization in the data space (e.g., example fitting, distillation). This paper introduces a Meta-Learned Advantage Hierarchy (MLAH) framework that is attack-model-agnostic and better suited to reinforcement learning, handling attacks in the decision space (as opposed to the data space) and directly mitigating the learned bias introduced by the adversary. In MLAH, we learn separate sub-policies (nominal and adversarial) in an online manner, guided by a supervisory master agent that detects the presence of the adversary by leveraging the advantage function of the sub-policies. We demonstrate that the proposed algorithm enables policy learning with significantly lower bias than state-of-the-art policy learning approaches, even in the presence of heavy state-information attacks. We present algorithm analysis and simulation results using popular OpenAI Gym environments.
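A minimal sketch of how an advantage-based supervisor of this kind might switch between sub-policies is given below. The class name, detection threshold, and windowing scheme are assumptions for illustration and are not taken from the paper; only the idea of using the advantage signal to detect an adversary comes from the abstract.

```python
import numpy as np

class MasterAgent:
    """Illustrative MLAH-style supervisor: monitors the advantage signal of
    the nominal sub-policy and hands control to the adversarial sub-policy
    when that signal stays persistently low (assumed detection rule)."""

    def __init__(self, threshold=-1.0, window=20):
        self.threshold = threshold  # assumed detection threshold
        self.window = window        # number of recent steps to average over
        self.advantages = []

    def update(self, reward, value, next_value, gamma=0.99):
        # One-step advantage estimate: A = r + gamma * V(s') - V(s).
        self.advantages.append(reward + gamma * next_value - value)
        self.advantages = self.advantages[-self.window:]

    def select(self, nominal_policy, adversarial_policy):
        # Switch to the adversarial sub-policy when the running advantage
        # of the nominal sub-policy drops below the threshold.
        if np.mean(self.advantages or [0.0]) < self.threshold:
            return adversarial_policy
        return nominal_policy
```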
Many existing region-of-attraction (ROA) analysis tools have difficulty addressing feedback systems with large-scale neural network (NN) policies and/or high-dimensional sensing modalities such as cameras. In this letter, we tailor the projected gradient descent (PGD) attack method as a general-purpose ROA analysis tool for high-dimensional nonlinear systems and end-to-end perception-based control. We show that ROA analysis can be approximated as a constrained maximization problem, so that PGD-based iterative methods can be applied directly. In the model-based setting, we show that the PGD updates can be performed efficiently using back-propagation. In the model-free setting, we propose a finite-difference PGD estimate that is general and requires only a black-box simulator generating trajectories of the closed-loop system from any initial state. Finally, we demonstrate the scalability and generality of our analysis tool on several numerical examples with large state dimensions or complex image observations.
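The following sketch illustrates what a finite-difference PGD search of this flavor could look like in the model-free setting. The `simulate` interface (a black-box rollout returning a scalar instability measure for a given initial state), the ball-shaped candidate region, and all step sizes are assumptions made for the example, not the paper's exact formulation.

```python
import numpy as np

def finite_diff_pgd_roa(simulate, x0, radius, eps=1e-4, step=1e-2, iters=100):
    """Search for a destabilizing initial state inside a candidate ball
    around x0 by maximizing a black-box instability measure with PGD,
    where the gradient is estimated via finite differences."""
    x = x0.copy()
    for _ in range(iters):
        # Finite-difference gradient estimate of the rollout objective.
        grad = np.zeros_like(x)
        for i in range(len(x)):
            e = np.zeros_like(x)
            e[i] = eps
            grad[i] = (simulate(x + e) - simulate(x - e)) / (2 * eps)
        # Gradient ascent step to maximize the instability measure,
        # then project back onto the candidate region ||x - x0|| <= radius.
        x = x + step * grad
        d = x - x0
        n = np.linalg.norm(d)
        if n > radius:
            x = x0 + radius * d / n
    return x, simulate(x)
```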