2019
DOI: 10.1109/access.2019.2930115
Non-Cooperative Energy Efficient Power Allocation Game in D2D Communication: A Multi-Agent Deep Reinforcement Learning Approach

Abstract: Recently, there has been widespread use of mobile devices and sensors, along with the rapid emergence of new wireless and networking technologies such as wireless sensor networks, device-to-device (D2D) communication, and vehicular ad hoc networks. These networks are expected to achieve a considerable increase in data rates, coverage, and the number of connected devices, with a significant reduction in latency and energy consumption. Because of energy resource constraints in users' devices and sensors, the problem of…


Cited by 74 publications (52 citation statements)
References 33 publications
“…And, the corresponding distributed iterative algorithm was also proposed and evaluated. In [21], a simulation-based optimization framework for D2D networks was proposed to achieve a tradeoff between energy consumption and network performance. Although the existence of a Nash equilibrium point was not proved theoretically, the simulation results showed that the proposed optimization framework performs excellently in most cases.…”
Section: Related Work
confidence: 99%
“…Very recently, the deterministic policy gradient (DPG) has been deployed as an actor-critic algorithm in which the policy gradient theorem is extended from stochastic policies to deterministic policies. Inspired by the success of deep Q-learning [26], which uses neural network function approximation to learn value functions online over very large state and action spaces, the combination of DPG and deep learning, called deep deterministic policy gradient, enables learning in continuous spaces.…”
Section: Distributed Deep Deterministic Policy Gradient
confidence: 99%
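The core DPG update the excerpt describes — moving the policy parameters along the critic's gradient with respect to the action, chained through the policy — can be illustrated with a deliberately tiny, hypothetical example. The quadratic critic Q(s, a) = -(a - 2s)² and the linear policy μ(s) = θs below are toy assumptions, not anything from the cited works; with them, the DPG update drives θ toward the optimal value 2.

```python
# Minimal illustration of the deterministic policy gradient (DPG) idea:
# update the policy parameter along dQ/da chained with dmu/dtheta.
# Toy setup (hypothetical): states s in [0, 1], true optimal action
# a*(s) = 2s, and a known critic Q(s, a) = -(a - 2s)^2 (no learned critic).

import random

def dpg_demo(steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    theta = 0.0  # linear deterministic policy: mu(s) = theta * s
    for _ in range(steps):
        s = rng.uniform(0.0, 1.0)
        a = theta * s
        dq_da = -2.0 * (a - 2.0 * s)      # gradient of the critic wrt the action
        dmu_dtheta = s                    # gradient of the policy wrt its parameter
        theta += lr * dq_da * dmu_dtheta  # chain rule: the DPG update
    return theta

if __name__ == "__main__":
    print(round(dpg_demo(), 2))  # converges near the optimal theta = 2.0
```

Deep deterministic policy gradient replaces the hand-written critic above with a learned deep Q-network and the linear policy with a neural actor, which is what makes learning in continuous action spaces tractable.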
“…Deep reinforcement learning, a combination of RL and deep neural networks, has been used widely in wireless communication thanks to its powerful features, impressive performance, and adequate processing time. The authors in [26] formulated a non-cooperative power allocation game in D2D communications and proposed three approaches, based on the deep Q-learning, double deep Q-learning, and dueling deep Q-learning algorithms, for multi-agent learning to find the optimal power level for each D2D pair in order to maximise network performance. The authors in [27] used the deep Q-learning algorithm to find the optimal sub-band and transmission power level for each V2V user in V2V communications while satisfying low-latency requirements.…”
Section: Introduction
confidence: 99%
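The discrete power-level selection described in the excerpt can be sketched in drastically simplified form. The cited work [26] uses multi-agent deep Q-learning; here a single tabular Q-learner over a noisy energy-efficiency reward (rate divided by power) stands in for the deep network, and the power levels, channel gain, and noise figure are invented toy values.

```python
# Hypothetical, heavily simplified sketch of learning a discrete transmit
# power level by reinforcement. A tabular bandit-style Q update stands in
# for the multi-agent deep Q-network of the cited approach.

import math
import random

POWER_LEVELS = [0.5, 1.0, 1.5, 2.0]  # watts, illustrative values only
GAIN, NOISE = 10.0, 1.0              # toy channel gain and noise power

def reward(p, rng):
    rate = math.log2(1.0 + GAIN * p / NOISE)  # Shannon-style achievable rate
    return rate / p + rng.gauss(0.0, 0.1)     # noisy energy efficiency (bits/J)

def learn(episodes=5000, alpha=0.05, eps=0.1, seed=1):
    rng = random.Random(seed)
    q = [0.0] * len(POWER_LEVELS)
    for _ in range(episodes):
        if rng.random() < eps:                       # epsilon-greedy exploration
            a = rng.randrange(len(POWER_LEVELS))
        else:
            a = max(range(len(POWER_LEVELS)), key=q.__getitem__)
        q[a] += alpha * (reward(POWER_LEVELS[a], rng) - q[a])  # Q update
    return max(range(len(POWER_LEVELS)), key=q.__getitem__)

if __name__ == "__main__":
    best = learn()
    print(POWER_LEVELS[best])  # the lowest level maximises rate/power here
```

In the actual game formulation each D2D pair runs its own learner and the reward couples agents through interference, which is what makes the deep, multi-agent variants necessary.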
“…The characteristics of each fault current reduction method introduced above are summarized in Table 1. This study applies reinforcement learning (RL) [22][23][24][25][26][27][28][29][30][31][32][33][34] to conduct bus and line separation more systematically; these are the most widely used techniques for grid operation, as they can be performed immediately and without additional cost. Because a grid contains many buses and lines, there are numerous ways to reduce short-circuit current.…”
Section: Introduction
confidence: 99%