Metrics and continuity in reinforcement learning

Lan, Charline Le; Bellemare, Marc G.; Castro, Pablo Samuel

doi:10.48550/arxiv.2102.01514

Cited by 2 publications

(3 citation statements)

References 0 publications


“…The idea to estimate the explanation element (E1) is to find the feature-sets of similar states that frequently appear conditioned on a particular action a determined by the policy π. That is, for a given state s for which users are asking for an explanation, we compute the appearance frequency of features in similar states to s. We chose to use a value-based metric since it's simple and can be approximated to reduce its computational cost [26]. Our similarity metric groups states that have a similar value v to the reference v_reference within the range 1.0 ± 0.05 × v_reference.…”

Section: A Generating Explanations (mentioning)

confidence: 99%

Interactive Explanations: Diagnosis and Repair of Reinforcement Learning Based Agent Behaviors

Cruz, Igarashi

2021

Preprint
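
The value-based similarity described in the citation statement above can be sketched roughly as follows. This is a minimal illustration, not the citing authors' implementation: the function names, the toy data, and the reading of "1.0 ± 0.05 × v_reference" as a band of ±5% around the reference value are all assumptions, and the conditioning on the policy's action is omitted for brevity.

```python
from collections import Counter

def similar_states(states, values, v_reference, tolerance=0.05):
    """Return states whose value lies within v_reference * (1 ± tolerance).

    Assumption: the band is relative to the reference value.
    """
    band = tolerance * abs(v_reference)
    return [s for s, v in zip(states, values) if abs(v - v_reference) <= band]

def feature_frequencies(states):
    """Fraction of the given states in which each feature appears."""
    counts = Counter(f for s in states for f in set(s))
    n = len(states)
    return {f: c / n for f, c in counts.items()} if n else {}

# Hypothetical toy data: each state is a tuple of active feature names.
states = [
    ("wall_left", "goal_near"),
    ("goal_near",),
    ("wall_left",),
    ("goal_near", "enemy"),
]
values = [1.00, 1.04, 0.50, 0.97]

group = similar_states(states, values, v_reference=1.0)
freqs = feature_frequencies(group)
```

In this toy example the state with value 0.50 falls outside the band, so frequencies are computed over the remaining three states only.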



“…Trivially any problem can be embedded in a metric space where the metric is taken to be the difference of values of the optimal Q function. Recent work has investigated the options of selecting a metric in terms of its induced topological structure on the space [28].…”

Section: Metric Space and Lipschitz Assumptions (mentioning)

confidence: 99%
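
As a small illustration of the "difference of values of the optimal Q function" mentioned in the statement above, the sketch below defines a distance between state-action pairs as the absolute difference of their optimal Q-values. The table `q_star` and its entries are hypothetical; in practice the optimal Q function is unknown. Strictly speaking this yields a pseudometric, since two distinct pairs with equal Q-values are at distance zero.

```python
def q_distance(q_star, x, y):
    """Absolute difference of optimal Q-values for state-action pairs x and y."""
    return abs(q_star[x] - q_star[y])

# Hypothetical optimal Q-values for three state-action pairs.
q_star = {
    ("s0", "left"): 1.0,
    ("s0", "right"): 2.5,
    ("s1", "left"): 2.5,
}

a, b, c = ("s0", "left"), ("s0", "right"), ("s1", "left")

d_ab = q_distance(q_star, a, b)  # pairs with different values
d_bc = q_distance(q_star, b, c)  # distinct pairs with equal values
d_ac = q_distance(q_star, a, c)
```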

“…This assumes access to the similarity metrics. Learning the metric (or picking the metric) is important in practice, but beyond the scope of this paper [62,28]. Assumption 3.…”

Section: Metric Space and Lipschitz Assumptions (mentioning)

confidence: 99%

Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces

Sinclair

Banerjee

2020

Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems


We introduce the technique of adaptive discretization to design efficient model-based episodic reinforcement learning algorithms in large (potentially continuous) state-action spaces. Our algorithm is based on optimistic one-step value iteration extended to maintain an adaptive discretization of the space. From a theoretical perspective, we provide worst-case regret bounds for our algorithm, which are competitive compared to the state-of-the-art model-based algorithms; moreover, our bounds are obtained via a modular proof technique, which can potentially extend to incorporate additional structure on the problem. From an implementation standpoint, our algorithm has much lower storage and computational requirements, due to maintaining a more efficient partition of the state and action spaces. We illustrate this via experiments on several canonical control problems, which shows that our algorithm empirically performs significantly better than fixed discretization in terms of both faster convergence and lower memory usage. Interestingly, we observe empirically that while fixed-discretization model-based algorithms vastly outperform their model-free counterparts, the two achieve comparable performance with adaptive discretization.
