2019
DOI: 10.1109/jiot.2019.2921159

Deep Deterministic Policy Gradient (DDPG)-Based Energy Harvesting Wireless Communications

Cited by 284 publications (113 citation statements)
References 27 publications

“…This method is a policy-based DRL algorithm; that is, in DDPG an Actor network can directly generate output actions from the input state vector. A Critic network in DDPG evaluates the actions generated by the Actor network and continuously optimizes the Actor network's action-selection strategy [36]-[38].…”
Section: Future Work
confidence: 99%
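For readers unfamiliar with the actor-critic structure this statement describes, below is a minimal PyTorch sketch of the two networks: the actor maps a state vector directly to a continuous action, and the critic scores the resulting (state, action) pair, whose estimate then serves as the actor's training signal. All dimensions, layer sizes, and names are illustrative assumptions, not the cited paper's architecture.

import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps a state vector directly to a continuous action (the 'Actor' network)."""
    def __init__(self, state_dim, action_dim, max_action=1.0):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),
        )
        self.max_action = max_action

    def forward(self, state):
        # Tanh bounds the raw output; max_action rescales it to the action range.
        return self.max_action * self.net(state)

class Critic(nn.Module):
    """Scores a (state, action) pair with an estimated Q-value."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

# Toy usage (dimensions are made up): the actor proposes actions and is
# updated by ascending the critic's Q-value estimate.
actor, critic = Actor(4, 1), Critic(4, 1)
state = torch.randn(8, 4)                    # batch of 8 states
action = actor(state)                        # actor proposes actions
actor_loss = -critic(state, action).mean()   # minimize negative Q = ascend Q
actor_loss.backward()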
“…The work in [14] considered the joint optimization of traffic scheduling and power allocation and minimized the total on-grid power consumption of macro and small cells while guaranteeing users' traffic requirements. Recently, some works have focused on solving EH problems with reinforcement learning: a novel energy management scheme based on reinforcement learning was proposed in [15] to maximize packet transmission rates while avoiding energy outage in wireless sensor networks, and energy management was investigated in [16] to maximize the net bit rate in EH wireless communications using a deep deterministic policy gradient.…”
Section: Related Work
confidence: 99%
“…The computational complexity of the proposed MDP scheme mainly comes from (16). Note that the total number of system states is $N_h^2 N_c N_b^2$ and the number of actions at each state is no more than $2N_b + 1$.…”
Section: F. Implementation Complexity and Overhead
confidence: 99%
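As a quick sanity check on these counts, the sketch below evaluates the state-space size and the per-state action bound. The variable names follow the statement above, but the numeric discretization levels are invented purely for illustration and do not come from the cited paper.

# Hypothetical discretization levels (illustrative values only):
N_h, N_c, N_b = 4, 3, 5

num_states = N_h**2 * N_c * N_b**2   # total number of system states
max_actions = 2 * N_b + 1            # upper bound on actions per state

print(f"states: {num_states}, actions per state <= {max_actions}")
# -> states: 1200, actions per state <= 11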