Adversarial Reinforcement Learning in a Cyber Security Simulation

Elderman, Richard; Pater, Leon J.J.; Thie, Albert; Drugan, Mădălina M.; Wiering, Marco A.

doi:10.5220/0006197105590566

Cited by 59 publications

(56 citation statements)

References 11 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…An agent benefiting from reinforcement learning has the following dilemma: choosing between an action that is considered the best (exploitation) or choosing other actions to see if any of these actions is better (exploration). For the Monte Carlo approach, four different research algorithms are used to attempt to address this problem in cyber security, namely e-greedy, Softmax, Upper Confidence Bound 1 and Discounted Upper Confidence Bound [12].…”

Section: Simulation and Resultsmentioning

confidence: 99%

Developing Monte Carlo Simulator of Reinforcement Learning Type

Tsochev¹

2020

PECR

View full text Add to dashboard Cite

Monte Carlo methods are a way to solve the reinforcement learning problem based on average test results. To ensure that well-defined results are available, Monte Carlo methods are used only for episodic tasks. The Monte Carlo term is often used more widely in any valuation method whose operation involves significant participation on a random basis. Here it is specifically used for methods based on the average of full results (as opposed to methods that are learned from incomplete results). The paper describes a simulator for estimating raindrops in a specific area using the package matlib. Keywords: Monte Carlo, reinforcement learning, simulation, matlib, python

show abstract

Section: Simulation and Resultsmentioning

confidence: 99%

Developing Monte Carlo Simulator of Reinforcement Learning Type

Tsochev¹

2020

PECR

View full text Add to dashboard Cite

show abstract

“…In the game model, the zero-sum game model is under incomplete information on cyberspace, in which both the attacker and the defender attempt to win the game, and this game process cannot be described by the classifier [25]. Also, a method that combined reinforcement learning and supervised learning was proposed and then applied to the malicious traffic detection model to achieve the integration of reinforcement learning and supervised learning and achieved better performance [26].…”

Section: Related Workmentioning

confidence: 99%

A Hidden Attack Sequences Detection Method Based on Dynamic Reward Deep Deterministic Policy Gradient

Liu

Pan

et al. 2022

Security and Communication Networks

View full text Add to dashboard Cite

Attacker identification from network traffic is a common practice of cyberspace security management. However, network administrators cannot cover all security equipment due to the cyberspace management cost constraints, giving attackers the chance to escape from the surveillance of network security administrators by legitimate actions and to perform the attack in both physical domain and digital domain. Therefore, we proposed a hidden attack sequence detection method based on reinforcement learning to deal with the challenge through modeling the network administrators as an intelligent agent that learns their action policy from the interaction with the cyberspace environment. Following Deep Deterministic Policy Gradient (DDPG), the intelligent agent can not only discover the hidden attackers hiding in the legitimate action sequences but also reduce the cyberspace management cost. Furthermore, a dynamic reward DDPG method was proposed to improve defense performance, which set dynamic reward depending on the hidden attack sequences steps and agent’s check steps, compared to the fixed reward in common methods. Meanwhile, the method was verified in a simulated experimental cyberspace environment. Finally, the experimental results demonstrate that there are hidden attack sequences in cyberspace, and the proposed method can discover the hidden attack sequences. The dynamic reward DDPG shows superior performance in detecting hidden attackers, with a detection rate of 97.46%, which can improve the ability to discover hidden attackers and reduce the 6% cyberspace management cost compared to DDPG.

show abstract

“…Model-free approaches in which the agent is provided with minimal information about the structure of the problem have been recently considered through the adoption of RL [10,15,29,30]. While these works focus on the application of RL to solve specific challenges, in this paper we analyze the problem of how to define in a versatile and consistent way relevant CTF problems for RL.…”

Section: Related Workmentioning

confidence: 99%

The Agent Web Model: modeling web hacking for reinforcement learning

Erdődi

Zennaro

2021

Int. J. Inf. Secur.

View full text Add to dashboard Cite

Website hacking is a frequent attack type used by malicious actors to obtain confidential information, modify the integrity of web pages or make websites unavailable. The tools used by attackers are becoming more and more automated and sophisticated, and malicious machine learning agents seem to be the next development in this line. In order to provide ethical hackers with similar tools, and to understand the impact and the limitations of artificial agents, we present in this paper a model that formalizes web hacking tasks for reinforcement learning agents. Our model, named Agent Web Model, considers web hacking as a capture-the-flag style challenge, and it defines reinforcement learning problems at seven different levels of abstraction. We discuss the complexity of these problems in terms of actions and states an agent has to deal with, and we show that such a model allows to represent most of the relevant web vulnerabilities. Aware that the driver of advances in reinforcement learning is the availability of standardized challenges, we provide an implementation for the first three abstraction layers, in the hope that the community would consider these challenges in order to develop intelligent web hacking agents.

show abstract

Adversarial Reinforcement Learning in a Cyber Security Simulation

Cited by 59 publications

References 11 publications

Developing Monte Carlo Simulator of Reinforcement Learning Type

Developing Monte Carlo Simulator of Reinforcement Learning Type

A Hidden Attack Sequences Detection Method Based on Dynamic Reward Deep Deterministic Policy Gradient

The Agent Web Model: modeling web hacking for reinforcement learning

Contact Info

Product

Resources

About