2020
DOI: 10.1007/978-3-030-57628-8_1

Deep Reinforcement Learning with Temporal Logics

Cited by 37 publications (34 citation statements)
References 30 publications
“…Figure 11(c) shows the average and the maximum number of steps required to terminate for all the engines with every specification across 100 executions in logarithmic scale. The number of steps is a known measure used to compare RL methods logically constrained with LTL formulae [40‐43]. Known RL‐LTL methods take a high number of steps, in the order of hundreds of thousands, because these methods aim to converge to an optimal policy.…”
Section: Discussion (mentioning)
confidence: 99%
“…Several studies [40‐43] use LTL specifications as a high‐level guide for an RL agent. The RL agent in these studies never terminate and has to avoid violating a given specification indefinitely.…”
Section: Related Work (mentioning)
confidence: 99%
“…Our model extends Araki et al (2019b), which builds upon the Value Iteration Network (VIN) model (Tamar et al 2016) by applying a more structured variant of VIN to the product of a low-level MDP with a logical specification defined by an FSA. Other works incorporating logical structure into the imitation learning setting include Paxton et al (2017), Li, Ma, and Belta (2017), Hasanbeig, Abate, and Kroening (2018), Icarte et al (2018), Burke, Penkov, and Ramamoorthy (2019), and Gordon, Fox, and Farhadi (2019). These models assume that at least part of the logic specification is known, and they are not interpretable and manipulable.…”
Section: Related Work (mentioning)
confidence: 99%
“…[32] uses LTL to define constraints on a Monte Carlo Tree Search. [28] and [18] use the product of an LTL-derived FSA with an MDP to make learning more efficient.…”
Section: Related Work (mentioning)
confidence: 99%
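
The statement above refers to taking the product of an LTL-derived FSA with an MDP. As a rough illustration of that general construction only (not the method of any cited paper), the sketch below pairs each environment state with an automaton state and advances both together. Everything in it is a hypothetical toy: the four-cell corridor, the `step` and `label` functions, the two-state automaton for "eventually reach the goal", and the simplifying assumption that transitions are deterministic.

```python
# Minimal sketch of a product construction, under the toy assumptions stated above.

def build_product(states, actions, step, label, fsa_states, delta):
    """Return a dict mapping ((s, q), a) -> (s2, q2) for every product state and action."""
    prod = {}
    for s in states:
        for q in fsa_states:
            for a in actions:
                s2 = step(s, a)
                # The automaton reads the label of the state just entered;
                # if no edge is defined for that label, it stays where it is.
                q2 = delta.get((q, label(s2)), q)
                prod[((s, q), a)] = (s2, q2)
    return prod

# Hypothetical environment: a corridor with cells 0..3, goal at cell 3.
STATES = [0, 1, 2, 3]
ACTIONS = ["left", "right"]

def step(s, a):
    return max(s - 1, 0) if a == "left" else min(s + 1, 3)

def label(s):
    return "goal" if s == 3 else ""

# Hypothetical automaton for "eventually goal": q0 moves to q_acc on reading "goal".
FSA_STATES = ["q0", "q_acc"]
DELTA = {("q0", "goal"): "q_acc"}

product = build_product(STATES, ACTIONS, step, label, FSA_STATES, DELTA)
```

A learner working on `product` sees the automaton state as part of its own state, so progress toward the specification can be tracked and rewarded, for example on entering `q_acc`.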
“…In [21] the authors use LTL to design a sub-task extraction procedure as part of a more standard deep reinforcement learning setup. However, these methods assume the LTL specifications are already known, and [32,33,21,18] do not allow for a model that is easy to interpret and manipulate. By contrast, our model only requires the current FSA state and the location of logic propositions in the environment.…”
Section: Related Work (mentioning)
confidence: 99%