Evaluating the worst-case performance of a reinforcement learning (RL) agent under the strongest/optimal adversarial perturbations on state observations (within some constraints) is crucial for understanding the robustness of RL agents. However, finding the optimal adversary is challenging, in terms of both whether we can find the optimal attack and how efficiently we can find it. Existing works on adversarial RL either use heuristics-based methods that may not find the strongest adversary, or directly train an RL-based adversary by treating the agent as a part of the environment, which can find the optimal adversary but may become intractable in a large state space. In this paper, we propose a novel attack algorithm in which an RL-based "director" searches for the optimal policy perturbation, and an "actor" crafts state perturbations following the directions from the director (i.e., the actor executes targeted attacks). Our proposed algorithm, PA-AD, is theoretically optimal against an RL agent and significantly improves efficiency compared with prior RL-based works in environments with large or pixel state spaces. Empirical results show that PA-AD universally outperforms state-of-the-art attacking methods in a wide range of environments. Our method can be easily applied to any RL algorithm to evaluate and improve its robustness. Preprint. Under review.
In many reinforcement learning (RL) applications, the observation space is specified by human developers and restricted by physical realizations, and may thus be subject to dramatic changes over time (e.g. increased number of observable features). However, when the observation space changes, the previous policy will likely fail due to the mismatch of input features, and another policy must be trained from scratch, which is inefficient in terms of computation and sample complexity. Following theoretical insights, we propose a novel algorithm which extracts the latent-space dynamics in the source task, and transfers the dynamics model to the target task to use as a model-based regularizer. Our algorithm works for drastic changes of observation space (e.g. from vector-based observation to image-based observation), without any inter-task mapping or any prior knowledge of the target task. Empirical results show that our algorithm significantly improves the efficiency and stability of learning in the target task. * The work was done while the author was an intern at Unity Technologies.
Crystalline lithium fluoride (LiF) has been intensively pursued as a potential alternative solid electrolyte (SE) owing to its excellent chemical and electrochemical oxidation stability and good deformability. However, its low ionic conductivity still makes LiF challenging for practical SE applications. Herein, a Li-Zr-F composite-based SE prepared by liquid-mediated synthesis is studied. Methanol (CH<sub>3</sub>OH) was mainly evaluated as the liquid mediator for synthesizing Li-Zr-F composites at stoichiometric proportions of LiF and ZrF<sub>4</sub> of 2:1 and 2:0.8, followed by annealing at 25°C/150°C, 50°C/150°C, and 70°C/150°C. X-ray diffraction results revealed that the Li-Zr-F composites crystallized in three main phases: Li<sub>2</sub>ZrF<sub>6</sub> ( ), Li<sub>2</sub>ZrF<sub>6</sub> ( ), and Li<sub>4</sub>ZrF<sub>8</sub> ( ) octahedron structures. In addition, the effect of the cation stack sublattice produced with the methanol mediator on the ion conduction of the Li-Zr-F composites was investigated by electrochemical impedance spectroscopy (EIS). Through Zr<sup>4+</sup> substitution, the Li<sub>2</sub>ZrF<sub>6</sub> ( )-based SE exhibited the highest ion conduction, which increased to 2.40 × 10<sup>-8</sup> S/cm and 3.89 × 10<sup>-8</sup> S/cm, respectively, at a LiF:ZrF<sub>4</sub> proportion of 2:0.8 and a drying temperature of 50°C/150°C. An activation energy ( ) of 0.21 eV was achieved for a battery with the Li<sub>2</sub>ZrF<sub>6</sub> ( )-based SE, whereas LiF exhibited up to 0.78 eV, leading to a low kinetic rate for ion diffusion. These results imply that the Li<sub>2</sub>ZrF<sub>6</sub> ( )-based SE was successfully synthesized under the optimal condition of CH<sub>3</sub>OH-50°C/150°C, which could improve on the ionic conductivity of LiF.
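To give a rough sense of why the gap between 0.21 eV and 0.78 eV matters, the sketch below compares the Boltzmann factors exp(-E_a / (k_B T)) for the two activation energies. It assumes a simple Arrhenius-type temperature dependence with equal pre-exponential factors and a temperature of 298.15 K. None of these assumptions come from the abstract, and the function name `boltzmann_factor` is purely illustrative, so the result indicates only the exponential term, not a measured conductivity ratio.

```python
import math

# Boltzmann constant in eV/K (CODATA value).
K_B_EV_PER_K = 8.617333262e-5


def boltzmann_factor(activation_energy_ev: float, temperature_k: float) -> float:
    """Return exp(-E_a / (k_B * T)), the exponential term of an Arrhenius law."""
    return math.exp(-activation_energy_ev / (K_B_EV_PER_K * temperature_k))


# Assumed temperature (room temperature); not stated in the abstract.
TEMPERATURE_K = 298.15

# Activation energies quoted in the abstract.
EA_LI2ZRF6_EV = 0.21
EA_LIF_EV = 0.78

# Ratio of the exponential terms, assuming identical pre-exponential factors.
ratio = boltzmann_factor(EA_LI2ZRF6_EV, TEMPERATURE_K) / boltzmann_factor(
    EA_LIF_EV, TEMPERATURE_K
)
print(f"Boltzmann-factor ratio at {TEMPERATURE_K} K: {ratio:.2e}")
```

Under these assumptions the ratio is on the order of 10<sup>9</sup>. Real conductivities also depend on the pre-exponential factor, which this sketch deliberately ignores.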
Lower alcohols (C1−C7) are closely tied to daily life, and some of them are harmful to human health while others are not. For example, the methanol in liquor is harmful to...