“…A large number of studies have focused on applying reinforcement learning to use cases similar to the intrusion response use case we discuss in this paper [9]-[11], [17]-[52], [64], [72]. These works use a variety of models, including MDPs [20], [23], [25], [26], [31], [34], [36], [42], [51], [52], [64], stochastic games [10], [18], [28], [33], [45], [72], attack graphs [35], Petri nets [43], and POMDPs [9], [11], [21], [27], as well as various reinforcement learning algorithms, including Q-learning [18], [20], [23], [40], [43], [48], [64], [69], SARSA [21], PPO [10], [11], [34], [35], [37], hierarchical reinforcement learning [25], DQN [26], [36]-…”