In this work, Ni-doped manganite perovskite oxides (La0.8Sr0.2Mn(1-x)Ni(x)O3, x = 0.2 and 0.4) and undoped La0.8Sr0.2MnO3 were synthesized via a general and facile sol-gel route and used as bifunctional oxygen-cathode catalysts in rechargeable lithium-air batteries. Structural and compositional characterization showed that the La0.8Sr0.2Mn(1-x)Ni(x)O3 (x = 0.2 and 0.4) samples contained more oxygen vacancies than the undoped La0.8Sr0.2MnO3, as well as a certain amount of Ni(3+) (eg = 1) on their surfaces. The Ni-doped La0.8Sr0.2Mn(1-x)Ni(x)O3 (x = 0.2 and 0.4) exhibited higher bifunctional catalytic activity than the undoped La0.8Sr0.2MnO3. In particular, La0.8Sr0.2Mn0.6Ni0.4O3 had a lower total overpotential between the oxygen evolution reaction and the oxygen reduction reaction than La0.8Sr0.2MnO3; this value is even comparable to that of commercial Pt/C, but at a much lower cost. In the lithium-air battery, oxygen cathodes containing the La0.8Sr0.2Mn0.6Ni0.4O3 catalyst delivered the best electrochemical performance in terms of specific capacity and cycle life, and a plausible reaction mechanism is proposed to explain the improved performance.
Recent work on ride-sharing order dispatching has highlighted the importance of accounting for both spatial and temporal dynamics in the dispatching process to improve transportation-system efficiency. At the same time, deep reinforcement learning has advanced to the point where it achieves superhuman performance in a number of fields. In this work, we propose a deep-reinforcement-learning-based solution for order dispatching, and we conduct large-scale online A/B tests on DiDi's ride-dispatching platform to show that the proposed method significantly improves both total driver income and user-experience-related metrics. In particular, we model the ride-dispatching problem as a Semi-Markov Decision Process to account for the temporal aspect of dispatching actions. To improve the stability of value iteration with nonlinear function approximators such as neural networks, we propose Cerebellar Value Networks (CVNet) with a novel distributed state-representation layer. We further derive a regularized policy-evaluation scheme for CVNet that penalizes a large Lipschitz constant of the value network for additional robustness against adversarial perturbations and noise. Finally, we adapt various transfer-learning methods to CVNet for increased learning adaptability and efficiency across multiple cities. We conduct extensive offline simulations based on real dispatching data as well as online A/B tests on DiDi's platform. The results show that CVNet consistently outperforms other recently proposed dispatching methods, and that performance can be further improved through the efficient use of transfer learning.
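The two ingredients named above, a semi-MDP value target and a Lipschitz penalty on the value network, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the tiny two-layer value function, the sampled-pair estimate of the Lipschitz constant, and the `beta` weight are stand-ins, not the paper's CVNet architecture or its exact regularizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny value network V(s) = w2 . relu(W1 s); purely illustrative.
W1 = rng.normal(size=(16, 4)) * 0.5
w2 = rng.normal(size=16) * 0.5

def value(s):
    return float(np.maximum(W1 @ s, 0.0) @ w2)

def empirical_lipschitz(states, n_pairs=256):
    """Estimate the Lipschitz constant of V from sampled state pairs."""
    idx = rng.integers(0, len(states), size=(n_pairs, 2))
    ratios = []
    for i, j in idx:
        if i == j:
            continue
        num = abs(value(states[i]) - value(states[j]))
        den = np.linalg.norm(states[i] - states[j])
        ratios.append(num / max(den, 1e-8))
    return max(ratios)

def regularized_td_loss(transitions, beta=0.1, gamma=0.99):
    """Mean squared semi-MDP TD error plus a Lipschitz penalty.

    Each transition is (s, reward, s_next, k), where k is the number of
    time steps the dispatching action spanned, so the discount is gamma**k.
    """
    td_sq = []
    for s, r, s_next, k in transitions:
        target = r + (gamma ** k) * value(s_next)
        td_sq.append((value(s) - target) ** 2)
    states = [t[0] for t in transitions] + [t[2] for t in transitions]
    return float(np.mean(td_sq)) + beta * empirical_lipschitz(states)

# Toy batch of transitions with multi-step (semi-MDP) durations.
batch = [(rng.normal(size=4), rng.normal(), rng.normal(size=4), rng.integers(1, 5))
         for _ in range(32)]
loss = regularized_td_loss(batch)
```

The `gamma ** k` discount is what distinguishes the semi-MDP formulation from a standard one-step MDP: a ride that occupies a driver for k time steps is discounted accordingly, while the penalty term discourages value functions that change sharply between nearby states.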
Reinforcement learning has had many successes, but in practice it often requires significant amounts of data to learn high-performing policies. One common way to improve learning is to allow a trained (source) agent to assist a new (target) agent. The goals in this setting are to 1) improve the target agent's performance, relative to learning unaided, and 2) allow the target agent to outperform the source agent. Our approach leverages source agent demonstrations, removing any requirements on the source agent's learning algorithm or representation. The target agent then estimates the source agent's policy and improves upon it. The key contribution of this work is to show that leveraging the target agent's uncertainty in the source agent's policy can significantly improve learning in two complex simulated domains, Keepaway and Mario.
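One common way to operationalize the uncertainty described above is to train an ensemble of policy estimators on bootstrapped demonstrations and treat ensemble disagreement as uncertainty. The sketch below assumes exactly that; the nearest-centroid classifier, the disagreement threshold, and all names are illustrative stand-ins, not the method evaluated in Keepaway and Mario.

```python
import numpy as np

rng = np.random.default_rng(1)
N_ACTIONS = 3

# Hypothetical demonstration data: states paired with the source agent's actions.
demo_states = rng.normal(size=(200, 4))
demo_actions = (demo_states[:, 0] > 0).astype(int)  # toy source policy

def fit_member(states, actions, sample_frac=0.8):
    """Fit one ensemble member on a bootstrap sample of the demonstrations.

    A nearest-centroid classifier stands in for any policy estimator.
    """
    n = int(len(states) * sample_frac)
    idx = rng.choice(len(states), size=n, replace=True)
    s, a = states[idx], actions[idx]
    centroids = np.stack([s[a == c].mean(axis=0) if np.any(a == c)
                          else np.zeros(s.shape[1]) for c in range(N_ACTIONS)])
    return centroids

ensemble = [fit_member(demo_states, demo_actions) for _ in range(5)]

def predict(centroids, state):
    return int(np.argmin(np.linalg.norm(centroids - state, axis=1)))

def act(state, disagreement_threshold=0.4):
    """Imitate the estimated source policy where the ensemble agrees;
    fall back to the target agent's own exploration where it is uncertain."""
    votes = np.bincount([predict(m, state) for m in ensemble],
                        minlength=N_ACTIONS)
    disagreement = 1.0 - votes.max() / votes.sum()
    if disagreement <= disagreement_threshold:
        return int(np.argmax(votes))      # confident: follow the source policy
    return int(rng.integers(N_ACTIONS))   # uncertain: explore on its own
```

Because only demonstrations are consumed, nothing here depends on the source agent's learning algorithm or internal representation, and the fallback branch is where the target agent can eventually surpass the source.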
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context in which a citation appears and indicate whether the citing article provides supporting or contrasting evidence. scite is used by students and researchers around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.