2022
DOI: 10.1007/978-3-031-21213-0_3

COOL-MC: A Comprehensive Tool for Reinforcement Learning and Model Checking

Abstract: This paper presents COOL-MC, a tool that integrates state-of-the-art reinforcement learning (RL) and model checking. Specifically, the tool builds upon the OpenAI gym and the probabilistic model checker Storm. COOL-MC provides the following features: (1) a simulator to train RL policies in the OpenAI gym for Markov decision processes (MDPs) that are defined as input for Storm, (2) a new model builder for Storm, which uses callback functions to verify (neural network) RL policies, (3) formal abstractions that re…

Cited by 5 publications (3 citation statements)
References 34 publications
“…Our method, on the other hand, gives us a reachability probability done = 0.58 (see Table 1). However, at some point, our model checking method is also limited by the size of the induced DTMC and runs out of memory (Gross et al 2022).…”
Section: Discussion (mentioning)
confidence: 99%
“…Recall, the joint policy π induced by the set of all agent policies {π_i}_{i∈I} is a single policy π (Boutilier 1996). The tool COOL-MC (Gross et al 2022) allows model checking of a single RL policy against a user-provided PCTL property and MDP. Thereby, it builds the induced DTMC incrementally (Cassez et al 2005).…”
Section: Model Checking of CMARL Agents (mentioning)
confidence: 99%
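
The statement above says the induced DTMC is built incrementally from a single policy and an MDP. A minimal Python sketch of that general idea follows, assuming a toy dictionary-based MDP; the names `build_induced_dtmc`, `reachability_probability`, `TOY_MDP`, and `TOY_POLICY` are hypothetical and are not part of the COOL-MC or Storm API. The policy is queried only for states reachable under its own choices, so states it never visits are never expanded.

```python
from collections import deque

# Hypothetical toy MDP: state -> action -> list of (probability, next_state).
TOY_MDP = {
    "start": {"go": [(0.5, "done"), (0.5, "risky")], "detour": [(1.0, "unused")]},
    "risky": {"go": [(0.5, "done"), (0.5, "fail")]},
    "done": {"stay": [(1.0, "done")]},
    "fail": {"stay": [(1.0, "fail")]},
    "unused": {"stay": [(1.0, "unused")]},
}

# Hypothetical deterministic policy standing in for a trained RL policy.
TOY_POLICY = {"start": "go", "risky": "go", "done": "stay", "fail": "stay", "unused": "stay"}


def build_induced_dtmc(mdp, policy, initial_state):
    """Expand only the states reachable under the policy's chosen actions."""
    dtmc = {}
    frontier = deque([initial_state])
    while frontier:
        state = frontier.popleft()
        if state in dtmc:
            continue  # already expanded
        action = policy(state)  # one policy query per reachable state
        transitions = mdp[state][action]
        dtmc[state] = transitions
        for _, next_state in transitions:
            if next_state not in dtmc:
                frontier.append(next_state)
    return dtmc


def reachability_probability(dtmc, initial_state, targets, sweeps=1000):
    """Approximate the probability of eventually reaching a target state
    by repeated in-place updates (a simple value-iteration sketch)."""
    value = {s: (1.0 if s in targets else 0.0) for s in dtmc}
    for _ in range(sweeps):
        for s in dtmc:
            if s not in targets:
                value[s] = sum(p * value[t] for p, t in dtmc[s])
    return value[initial_state]


if __name__ == "__main__":
    induced = build_induced_dtmc(TOY_MDP, TOY_POLICY.__getitem__, "start")
    print(sorted(induced))
    print(reachability_probability(induced, "start", {"done"}))
```

In this toy model the state `unused` never enters the induced DTMC because the policy never takes the `detour` action. A real tool would hand the induced model to a model checker rather than iterate values by hand.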
“…Summary. These results are part of the publications in [15,29,35,38,55]. A common approach to safe reinforcement learning is to employ a so-called shield that forces an RL agent to select only safe actions.…”
Section: Safe Deep Reinforcement Learning (mentioning)
confidence: 99%