Erwan Lecarpentier scite author profile

Erwan Lecarpentier

3 Publications

11 Citation Statements Received

21 Citation Statements Given

Affiliations

Université de Toulouse, Office National d'Études et de Recherches Aérospatiales, National Higher French Institute of Aeronautics and Space

Publications

Open Loop Execution of Tree-Search Algorithms

Lecarpentier, Infantes, Lesire et al. 2018

In the context of tree-search stochastic planning algorithms where a generative model is available, we consider on-line planning algorithms that build trees in order to recommend an action. We investigate whether re-planning can be avoided in subsequent decision steps by directly using sub-trees as action recommenders. First, we propose a method for open-loop control via a new algorithm that decides at each time step whether to re-plan, based on an analysis of the statistics of the sub-tree. Second, we show that the probability of selecting a suboptimal action at any depth of the tree can be upper bounded and converges towards zero. Moreover, this upper bound decays logarithmically between subsequent depths. This leads to a distinction between node-wise optimality and state-wise optimality. Finally, we empirically demonstrate that our method achieves a compromise between loss of performance and computational gain.

Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning, Extended version

Lecarpentier, Rachelson 2019

Preprint

Lipschitz Lifelong Reinforcement Learning

Lecarpentier, Abel, Asadi et al. 2021

AAAI

We consider the problem of knowledge transfer when an agent is facing a series of Reinforcement Learning (RL) tasks. We introduce a novel metric between Markov Decision Processes (MDPs) and establish that close MDPs have close optimal value functions. Formally, the optimal value functions are Lipschitz continuous with respect to the task space. These theoretical results lead us to a value-transfer method for Lifelong RL, which we use to build a PAC-MDP algorithm with improved convergence rate. Further, we show that the method experiences no negative transfer with high probability. We illustrate the benefits of the method in Lifelong RL experiments.
