Multi-Agent Deep Reinforcement Learning for Dynamic Power Allocation in Wireless Networks
2019
DOI: 10.1109/jsac.2019.2933973

Cited by 531 publications (334 citation statements)
References 34 publications
“…Recently, Guo et al. [111] have proposed a novel multi-agent framework, which includes a centralized agent and multiple intelligent nodes. In particular, the centralized agent is responsible for training a common RL model for all the intelligent nodes, and each intelligent node makes action decisions independently according to the trained RL model.…”
Section: B. Intelligent Action, 1) Reinforcement Learning Enabled Inte… (mentioning)
confidence: 99%
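
The centralized-training, distributed-execution pattern described in the statement above can be illustrated with a short sketch. This is a minimal, hypothetical example, not the authors' implementation: the class names, network sizes, and the linear Q-function are assumptions made for illustration. A central trainer fits one shared Q-function from transitions pooled from all nodes, and each node then selects actions independently from its local observation using a copy of the trained parameters.

    import numpy as np

    # Minimal sketch (hypothetical): one central trainer learns a shared linear
    # Q-function from transitions collected by every node; each node then acts
    # greedily on its own local observation with a copy of the trained weights.

    OBS_DIM, N_ACTIONS = 4, 5          # assumed sizes, for illustration only
    GAMMA, LR = 0.95, 1e-2

    class CentralTrainer:
        def __init__(self):
            # shared parameters: one weight matrix mapping observation -> Q-values
            self.W = np.zeros((OBS_DIM, N_ACTIONS))

        def q_values(self, obs):
            return obs @ self.W

        def train_step(self, batch):
            # one semi-gradient TD(0) update on transitions pooled from all nodes
            for obs, a, r, next_obs in batch:
                target = r + GAMMA * np.max(self.q_values(next_obs))
                td_error = target - self.q_values(obs)[a]
                self.W[:, a] += LR * td_error * obs

    class IntelligentNode:
        def __init__(self, weights):
            self.W = weights.copy()       # local copy of the common model

        def act(self, obs, eps=0.05):
            # each node decides independently from its local observation
            if np.random.rand() < eps:
                return np.random.randint(N_ACTIONS)
            return int(np.argmax(obs @ self.W))

    # toy usage: pool random transitions, train centrally, then distribute weights
    rng = np.random.default_rng(0)
    trainer = CentralTrainer()
    batch = [(rng.normal(size=OBS_DIM), rng.integers(N_ACTIONS),
              rng.normal(), rng.normal(size=OBS_DIM)) for _ in range(64)]
    trainer.train_step(batch)
    nodes = [IntelligentNode(trainer.W) for _ in range(3)]
    actions = [node.act(rng.normal(size=OBS_DIM)) for node in nodes]
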
“…Indeed, this is fundamental and helps the agent to figure out exactly which action to perform until it converges to an optimal policy. Nevertheless, as highlighted in [13], Q-learning has two serious problems: (1) the amount of memory required to store and update the Q-table can grow exponentially as the number of states and actions increases, and (2) many states are rarely visited, so the time required to explore all of these state/action pairs and build a good estimate of the Q-table would be unrealistic or impractical in a real setting [14].…”
Section: Problem Formulation and Optimal Solution (mentioning)
confidence: 99%
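
To make the scaling issue in the statement above concrete, tabular Q-learning needs one table entry per state-action pair, so memory grows with |S| x |A|. The sketch below is a minimal illustration with hypothetical state and action counts; it shows the standard one-step update and why large state spaces make the table impractical, which is the usual motivation for replacing it with a DQN.

    import numpy as np

    # Minimal tabular Q-learning sketch (hypothetical problem sizes).
    # The table holds one value per (state, action) pair, so its memory
    # footprint grows with |S| * |A|; with continuous or high-dimensional
    # channel states this quickly becomes impractical, which is why a DQN
    # replaces the table with a neural-network approximator.

    N_STATES, N_ACTIONS = 10_000, 10     # assumed discretization, illustration only
    ALPHA, GAMMA = 0.1, 0.95

    Q = np.zeros((N_STATES, N_ACTIONS))  # already 100,000 entries; realistic
                                         # state spaces need far more

    def q_update(s, a, r, s_next):
        """Standard one-step Q-learning update on the table."""
        td_target = r + GAMMA * Q[s_next].max()
        Q[s, a] += ALPHA * (td_target - Q[s, a])

    # toy usage
    q_update(s=0, a=3, r=1.0, s_next=42)
    print(Q.nbytes / 1e6, "MB used by the table")  # grows linearly in |S| * |A|
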
“…Then, the train DQN updates its parameters with the new parameters obtained from training. According to [14], convergence to a good set of parameters occurs quickly.…”
Section: Problem Formulation and Optimal Solution (mentioning)
confidence: 99%
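
The parameter hand-off mentioned in the statement above can be sketched as a periodic copy of weights. This is a hedged illustration under assumptions: the class name, the update rule, and the sync interval are hypothetical and not the cited paper's exact procedure. One parameter set is optimized every training step, and at a fixed interval its current values are pushed to the parameters actually used for acting.

    import numpy as np

    # Hedged sketch of a periodic parameter hand-off: a "train" parameter set is
    # updated continuously, and every SYNC_EVERY steps its current values are
    # copied to the parameters used for acting (or to a target network). All
    # names and the interval are assumptions for illustration only.

    SYNC_EVERY = 100

    class ParamSync:
        def __init__(self, dim):
            self.train_params = np.zeros(dim)   # updated every training step
            self.acting_params = np.zeros(dim)  # frozen between syncs
            self.step = 0

        def train_update(self, gradient, lr=1e-3):
            self.train_params -= lr * gradient
            self.step += 1
            if self.step % SYNC_EVERY == 0:
                # push freshly trained parameters to the acting copy
                self.acting_params = self.train_params.copy()

    # toy usage: random "gradients" stand in for real backpropagated updates
    sync = ParamSync(dim=8)
    rng = np.random.default_rng(1)
    for _ in range(250):
        sync.train_update(rng.normal(size=8))
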