Proceedings of the 9th International Conference on Agents and Artificial Intelligence 2017
DOI: 10.5220/0006197105590566
Adversarial Reinforcement Learning in a Cyber Security Simulation

Abstract: This paper focuses on cyber-security simulations in networks modeled as a Markov game with incomplete information and stochastic elements. The resulting game is an adversarial sequential decision-making problem played by two agents, the attacker and the defender. Each agent pits a reinforcement learning technique, such as neural networks, Monte Carlo learning, or Q-learning, against the other, and their effectiveness against learning opponents is examined. The results showed that Monte Carlo learning with…
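As a rough illustration of the kind of agent the abstract describes, a minimal tabular Q-learning sketch is shown below. This is not the paper's implementation; the class name, hyperparameters, and state/action encoding are all illustrative assumptions. One such learner could be instantiated for the attacker and one for the defender.

```python
import random
from collections import defaultdict

class QLearner:
    """Minimal tabular Q-learner (illustrative sketch, not the paper's code).

    One instance each could play the attacker and the defender in a
    two-agent Markov game, each treating the other as part of the
    environment.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration rate

    def act(self, state):
        """Epsilon-greedy action selection."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)          # explore
        return max(self.actions, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, state, action, reward, next_state):
        """Standard Q-learning backup toward the bootstrapped target."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In an adversarial loop, each agent would call `act` on the current network state, the simulator would resolve both moves, and each agent would call `update` with its own reward signal.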

Cited by 59 publications (56 citation statements)
References 11 publications
“…An agent using reinforcement learning faces the following dilemma: choosing the action currently considered best (exploitation) or choosing other actions to see whether any of them is better (exploration). For the Monte Carlo approach, four different algorithms are used to address this problem in cyber security, namely ε-greedy, Softmax, Upper Confidence Bound 1 and Discounted Upper Confidence Bound [12].…”
Section: Simulation and Results
confidence: 99%
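The four exploration strategies named in the statement above can be sketched as action-selection rules over a table of estimated action values. This is a generic bandit-style sketch, not the cited paper's code; the function names and defaults are assumptions, and the Discounted UCB variant (which down-weights older observations with a discount factor) is only noted in a comment.

```python
import math
import random

def epsilon_greedy(q, epsilon=0.1):
    """With probability epsilon explore uniformly; otherwise pick the argmax."""
    if random.random() < epsilon:
        return random.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])

def softmax(q, temperature=1.0):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    prefs = [math.exp(v / temperature) for v in q]
    total = sum(prefs)
    r, acc = random.random() * total, 0.0
    for a, p in enumerate(prefs):
        acc += p
        if r <= acc:
            return a
    return len(q) - 1

def ucb1(q, counts, t):
    """UCB1: value plus an optimism bonus that shrinks as an action is tried.

    Discounted UCB follows the same shape, but both the value estimates and
    the counts are decayed by a discount factor so recent outcomes dominate
    (useful against a non-stationary, learning opponent).
    """
    def score(a):
        if counts[a] == 0:
            return float("inf")   # try every action at least once
        return q[a] + math.sqrt(2 * math.log(t) / counts[a])
    return max(range(len(q)), key=score)
```

Each rule trades off exploitation and exploration differently: ε-greedy explores blindly, Softmax explores in proportion to estimated value, and the UCB variants explore where uncertainty is highest.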
“…In the game model, a zero-sum game is played under incomplete information in cyberspace, in which both the attacker and the defender attempt to win, and this game process cannot be described by a classifier [25]. Also, a method combining reinforcement learning and supervised learning was proposed and applied to a malicious traffic detection model, achieving the integration of the two approaches and better performance [26].…”
Section: Related Work
confidence: 99%
“…Model-free approaches, in which the agent is provided with minimal information about the structure of the problem, have recently been considered through the adoption of RL [10,15,29,30]. While these works focus on the application of RL to solve specific challenges, in this paper we analyze the problem of how to define relevant CTF problems for RL in a versatile and consistent way.…”
Section: Related Work
confidence: 99%