2020
DOI: 10.1109/access.2020.2974786

An Intelligent Deployment Policy for Deception Resources Based on Reinforcement Learning

Abstract: Traditional deception-based cyber defenses (DCD) often adopt the static deployment policy that places the deception resources in some fixed positions in the target network. Unfortunately, the effectiveness of these deception resources has been greatly restricted by the static deployment policy, which also causes the deployed deception resources to be easily identified and bypassed by attackers. Moreover, the existing studies on dynamic deployment policy, which make many strict assumptions and constraints, are …

Cited by 28 publications (23 citation statements)
References 29 publications
“…Next, the capturing ability of the deployment strategy was evaluated and compared with four other strategies: static deployment, dynamic deployment following alarms, the Q_Learning method proposed in the literature [29], and the N2-DQN method proposed in the literature [24]. The experiment set up a total of 3 situations (0.7/0.2/0.1, 0.2/0.7/0.1, and 0.3/0.3/0.4), and 200 attacks were carried out in each situation.…”
Section: Results (mentioning; confidence: 99%)
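The evaluation protocol quoted above (an evaluated strategy compared against four baselines, three situations, 200 attacks each) can be pictured with a small harness. The sketch below is purely illustrative: the strategy interface, the `simulate_attack` outcome model, and the reading of the three situations as attacker-type mixes are assumptions for this sketch, not details from the cited papers.

```python
import random

# The three situations quoted in the excerpt; treating each triple as the
# probability of drawing attacker type 0, 1, or 2 is an assumption.
SITUATIONS = [(0.7, 0.2, 0.1), (0.2, 0.7, 0.1), (0.3, 0.3, 0.4)]
ATTACKS_PER_SITUATION = 200

def simulate_attack(strategy, attacker_type):
    """Placeholder outcome model: True if the current deployment captures the
    attacker. A real study would replay the attack against the deployed
    deception resources on the network model."""
    return random.random() < strategy.capture_probability(attacker_type)

def evaluate(strategies):
    """Capture rate per strategy per situation (3 situations x 200 attacks)."""
    results = {name: [] for name in strategies}
    for mix in SITUATIONS:
        for name, strategy in strategies.items():
            captures = 0
            for _ in range(ATTACKS_PER_SITUATION):
                attacker_type = random.choices([0, 1, 2], weights=mix)[0]
                captures += simulate_attack(strategy, attacker_type)
            results[name].append(captures / ATTACKS_PER_SITUATION)
    return results
```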
“…Zhang et al. [28] established an incomplete-information stochastic game model for the attack-and-defense process after attackers have entered the real system and used an improved Q_Learning algorithm to solve it. However, their decisions focus on system state transitions under different defense actions, and the algorithm requires all network states to be known. Our algorithm does not require all states, imposes fewer restrictions on scenario settings, and does not need to guess the attackers' type in advance, which reduces the difficulty of scene modeling and is more practical. Wang et al. [29] combined a two-layer threat penetration map, Q_Learning, and dynamic deployment to propose a reinforcement-learning-based dynamic deployment strategy for deception resources; their strategy can also predict the attackers' next path from the current alert paths. However, that method considers only a single attack mode, the defender learns offline throughout, and it can only defend against known attack modes learned from collected attack information. In addition, the method assumes the attacker has already compromised a certain system node, so the starting position of the attack path is fixed and resources cannot be pre-deployed.…”
Section: Related Work (mentioning; confidence: 99%)
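The excerpt notes that the strategy in [29] predicts the attacker's next path from the current alert paths over a two-layer threat penetration map, learned offline from collected attack information. The sketch below only illustrates that general idea with a first-order transition model over graph nodes; the class, node names, and learning scheme are hypothetical and far simpler than the cited method.

```python
from collections import defaultdict

class AlertPathPredictor:
    """Offline-learned first-order model of attacker movement over a threat
    penetration graph (illustrative stand-in, not the two-layer map of [29])."""

    def __init__(self):
        # transition_counts[node][next_node] = how often attacks moved node -> next_node
        self.transition_counts = defaultdict(lambda: defaultdict(int))

    def fit(self, alert_paths):
        """Learn from historical alert paths, each a list of compromised nodes."""
        for path in alert_paths:
            for src, dst in zip(path, path[1:]):
                self.transition_counts[src][dst] += 1

    def predict_next(self, current_path, top_k=3):
        """Rank likely next nodes given the tail of the current alert path."""
        last = current_path[-1]
        candidates = self.transition_counts[last]
        total = sum(candidates.values()) or 1
        ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
        return [(node, count / total) for node, count in ranked[:top_k]]

# Usage: fit on collected attack traces, then place deception resources on the
# top-ranked predicted next nodes (node names here are made up).
predictor = AlertPathPredictor()
predictor.fit([["web", "db", "admin"], ["web", "file", "admin"], ["web", "db", "backup"]])
print(predictor.predict_next(["web", "db"]))
```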
“…Wang et al. [137] identified an optimal deployment strategy for deception resources, such as honeypots. Specifically, the authors developed a Q-learning algorithm that yields an intelligent deployment policy, dynamically placing deception resources as the network security state changes.…”
Section: Pros and Cons (mentioning; confidence: 99%)
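The statement above captures the paper's core idea: a Q-learning policy that re-places deception resources as the network security state changes. Below is a minimal tabular Q-learning sketch of such a deployment loop, assuming an abstract environment interface (`reset`/`step`), a discrete set of candidate positions, and a capture-based reward; none of these details are taken from the paper itself.

```python
import random
from collections import defaultdict

class QLearningDeployer:
    """Tabular Q-learning over (network security state, deployment position)."""

    def __init__(self, positions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.positions = positions            # candidate nodes for deception resources
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)           # Q[(state, position)]

    def choose(self, state):
        # epsilon-greedy choice of where to place the deception resource next
        if random.random() < self.epsilon:
            return random.choice(self.positions)
        return max(self.positions, key=lambda p: self.q[(state, p)])

    def update(self, state, position, reward, next_state):
        # standard one-step Q-learning backup
        best_next = max(self.q[(next_state, p)] for p in self.positions)
        td_target = reward + self.gamma * best_next
        self.q[(state, position)] += self.alpha * (td_target - self.q[(state, position)])

def train(env, deployer, episodes=500):
    """env is an assumed interface: reset() -> state; step(position) ->
    (next_state, reward, done), where reward is positive when the attacker
    hits a deployed deception resource."""
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            position = deployer.choose(state)
            next_state, reward, done = env.step(position)
            deployer.update(state, position, reward, next_state)
            state = next_state
```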
“…In [88], the authors design the information revealed to the users and attackers to elicit behaviors in favor of the defender. In [89] and [90], RL is used to optimally deploy the deception resources and emulate adversaries, respectively. Many works have attempted to address various spoofing attacks using RL.…”
Section: Information-related Vulnerability (mentioning; confidence: 99%)