2022
DOI: 10.1007/s10489-022-03363-0

Improving the exploration efficiency of DQNs via the confidence bound methods

Cited by 3 publications (2 citation statements)
References 18 publications
“…On the other hand, the use of empirical samples from different strategies reduces the correlations between data and improves the generalization ability of the algorithm. The MDRL-TP multiagent cross-domain routing algorithm also uses a decaying ε-greedy detection mechanism (Wen et al., 2022): where x denotes a random variable between [0,1].…”
Section: MDRL-TP Multiagent Cross-domain Routing Algorithm Design (mentioning)
confidence: 99%
“…On the other hand, the use of empirical samples from different strategies reduces the correlations between data and improves the generalization ability of the algorithm. The MDRL-TP multiagent cross-domain routing algorithm also uses a decaying ε-greedy detection mechanism (Wen et al., 2022): [Cross-domain routing method in SDN] [5.2 Prediction algorithm performance and experimental parameter analysis] By using the GRU-based network traffic state prediction algorithm developed in our previous work (Huang et al., 2022), an agent can learn to obtain a higher reward value. The reason for this is that the GRU-based prediction algorithm can monitor hidden network traffic states in a large-scale SDN under multicontroller management, which are difficult to obtain based solely on the multithreaded SDN measurement mechanism and cooperative communication module.…”
Section: Dueling DQN DRL Algorithm (mentioning)
confidence: 99%
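
The citation statements above refer to a decaying ε-greedy mechanism in which a random variable x in [0,1] decides between exploring and exploiting, but the formula itself is not reproduced on this page. The following is a minimal Python sketch of the general technique only, not the formula from Wen et al. (2022) or from the citing paper. The exponential decay schedule, the function names `epsilon_at` and `select_action`, and all parameter values are assumptions made for illustration.

```python
import math
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Return epsilon for a given training step.

    Assumed schedule (hypothetical, not taken from the cited papers):
    exponential decay from eps_start toward eps_end.
    """
    return eps_end + (eps_start - eps_end) * math.exp(-step / decay_steps)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy action selection over a list of Q-values.

    Draw x uniformly from [0, 1). If x < epsilon, explore by picking a
    random action index; otherwise exploit by picking the index of the
    largest Q-value.
    """
    x = rng.random()
    if x < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]  # hypothetical Q-values for three actions
    for step in (0, 500, 5000):
        eps = epsilon_at(step)
        print(step, round(eps, 3), select_action(q, eps))
```

With ε near 1 early in training the agent mostly explores; as ε decays toward its floor the agent increasingly selects the action with the highest estimated value.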