2017 IEEE International Conference on Communications (ICC) 2017
DOI: 10.1109/icc.2017.7997233

Optimal transmission policy in energy harvesting wireless communications: A learning approach

Cited by 17 publications (6 citation statements)
References: 13 publications
“…ESS can enumerate all possible solutions during the short-term time horizon and thus attain an optimal solution. QLA, a well-known reinforcement learning program, is widely used to solve some long-term or short-term utilities [36,37]. To better assess the effectiveness of the proposed algorithms, we implement QLA as a centralized one.…”
Section: Simulation Results (mentioning)
confidence: 99%
“…Therefore, UAVs' policy indicates the probability distribution of changed location in each time slot for each possible state. UAV m, ∀m ∈ ℳ, experiences the environment by taking suitable action in a particular state following policy π ∈ Π, where the expected mapping value between state and action can be expressed using Bellman equation [28] as […], where π* is the optimal policy of UAV m under arbitrary state. This optimal policy can generate maximum discounted cumulative reward than any other policies which are the elements of policy space Π.…”
Section: Reinforcement Learning Based On SARSA Methods, 4.1 Mapping Between State and Action For Optimal Decision (mentioning)
confidence: 99%
“…According to (27) and (28), it is observed that learning step sizes and initial velocity of UAVs influence the evolution of instantaneous transmission rate of the network and convergence properties corresponding to the proposed SARSA algorithm. Fig.…”
Section: Impact Of Learning Parameters On Deployment Strategy (mentioning)
confidence: 99%
“…The model could be Markov decision process (MDP) or regression model based on statistic data [20]. In practical situation, the future channel state is unknown, and the learning theoretic approach is suitable for unpredictable case [21]. In this case, the transmitter learns the optimal energy allocation policies by performing actions and observing their rewards.…”
Section: Related Work (mentioning)
confidence: 99%
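
The learning-theoretic idea in the statement above (a transmitter that learns an energy allocation policy by acting and observing rewards) can be illustrated with a minimal tabular Q-learning sketch. This is not the method of the cited paper or of any citing paper. The battery capacity, the Bernoulli energy arrivals, the log2(1 + energy) reward, and every function and parameter name below are hypothetical choices made only for illustration.

```python
import math
import random


def train_q_table(capacity=3, harvest_prob=0.5, steps=20000,
                  alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy energy-harvesting transmitter.

    Assumed toy model (not taken from the source): the state is the
    battery level, the action is the number of energy units spent in
    the current slot, and one unit arrives with probability harvest_prob.
    """
    rng = random.Random(seed)
    # q[b][a]: estimated value of spending a units with b units stored.
    # Only actions a <= b exist, so row b has b + 1 entries.
    q = [[0.0] * (b + 1) for b in range(capacity + 1)]
    battery = 0
    for _ in range(steps):
        # Epsilon-greedy choice: explore at random, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.randint(0, battery)
        else:
            action = max(range(battery + 1), key=lambda a: q[battery][a])
        # Assumed reward: achievable rate on a unit-gain channel.
        reward = math.log2(1.0 + action)
        # Assumed energy arrival: one unit with probability harvest_prob.
        harvested = 1 if rng.random() < harvest_prob else 0
        next_battery = min(capacity, battery - action + harvested)
        # Q-learning update toward reward plus discounted best next value.
        target = reward + gamma * max(q[next_battery])
        q[battery][action] += alpha * (target - q[battery][action])
        battery = next_battery
    return q


def greedy_policy(q):
    """Return, for each battery level, the action with the highest Q-value."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]
```

Calling greedy_policy(train_q_table()) returns one energy allocation per battery level. The transmitter never uses the arrival probability directly; it only sees sampled arrivals and rewards, which mirrors the case where the future state is unknown.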