Optimal transmission policy in energy harvesting wireless communications: A learning approach

Wu, Keyu; Tellambura, Chintha; Jiang, Hai

doi:10.1109/icc.2017.7997233

Cited by 17 publications

(6 citation statements)

References 13 publications


“…ESS can enumerate all possible solutions during the short-term time horizon and thus attain an optimal solution. QLA, a well-known reinforcement learning program, is widely used to solve some long-term or short-term utilities [36,37]. To better assess the effectiveness of the proposed algorithms, we implement QLA as a centralized one.…”

Section: Simulation Results (mentioning)

confidence: 99%

Energy-Efficient Time-Domain Equilibrium Scheduling and Optimization Scheme for Energy Harvesting-Powered D2D Communication

et al. 2020


Energy Harvesting- (EH-) powered Device-to-Device (D2D) Communication underlaying Cellular Network (EH-DCCN) has been deemed as one of the basic building blocks of Internet of Things due to its green energy efficiency and adjacent communication. But available energy will be one of the biggest obstacles when implementing EH-DCCN due to the immaturity of EH technology and the volatility of environmental energy resources. To improve energy utilization, this study investigates an efficient scheduling and power allocation scheme about transmission load equilibrium in the time domain. Accordingly, a short-term Sum Energy Efficiency (stSEE) maximization problem for EH-powered D2D communication is modelled, while ensuring a fundamental transmission rate requirement of cellular users. Consequently, the optimization problem is a nonconvex mixed integer nonlinear programming problem. Thus, we propose a two-layer convex approximation iteration algorithm which can obtain a feasible quasioptimal solution for the stSEE problem. Simultaneously, a two-step heuristic algorithm in a slot-by-slot fashion is also developed to acquire a suboptimal solution without requiring statistical knowledge of channel and energy arrival processes. Simulated analysis indicates that the short-term scheduling strategy can obtain better performances in terms of energy efficiency and transmission rate than conventional real-time scheduling scheme. Besides, the maximum scheduled number of EH-D2D pairs underlaying one cellular user under different EH efficiency is analysed, which can give us a theoretical reference about the deployment of future EH-DCCN.


“…Therefore, UAVs' policy indicates the probability distribution of changed location in each time slot for each possible state. UAV , ∀ ∈ ℳ experiences the environment by taking suitable action in a particular state following policy ∈ Π where the expected mapping value between state and action can be expressed using Bellman equation [28] as where * is the optimal policy of UAV under arbitrary state ( ) ∈ . This optimal policy can generate maximum discounted cumulative reward than any other policies which are the elements of policy space Π.…”

Section: Reinforcement Learning Based On SARSA Methods, 4.1 Mapping Between State and Action For Optimal Decision (mentioning)

confidence: 99%
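The quote above refers to a Bellman equation without reproducing it legibly. As a reminder only, the standard textbook form of the Bellman optimality equation for an action-value function is sketched below. The symbols (state $s$, action $a$, reward $r$, discount factor $\gamma$, policy $\pi$) are assumed here and are not taken from the citing paper, whose notation may differ.

```latex
Q^{*}(s,a) = \mathbb{E}\left[\, r_{t+1} + \gamma \max_{a'} Q^{*}(s_{t+1}, a') \;\middle|\; s_t = s,\ a_t = a \right],
\qquad
\pi^{*}(s) = \arg\max_{a} Q^{*}(s,a)
```

Under this form, a policy that is greedy with respect to $Q^{*}$ attains the maximum expected discounted cumulative reward over the policy space, which matches the claim in the quoted passage.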

“…According to (27) and (28), it is observed that learning step sizes and initial velocity of UAVs influence the evolution of instantaneous transmission rate of the network and convergence properties corresponding to the proposed SARSA algorithm. Fig.…”

Section: Impact Of Learning Parameters On Deployment Strategy (mentioning)

confidence: 99%
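To illustrate why the learning step size matters for a SARSA learner, here is a minimal sketch of the generic on-policy SARSA update. It is not the citing paper's algorithm: the dict-of-dicts table layout, the function name `sarsa_update`, and the toy numbers are all assumptions made for illustration.

```python
def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """Apply one generic SARSA step to a dict-of-dicts Q-table (assumed layout)."""
    # On-policy target: uses the action actually chosen in the next state.
    td_target = reward + gamma * q[s_next][a_next]
    td_error = td_target - q[s][a]
    # The step size alpha scales how far the estimate moves toward the target.
    q[s][a] += alpha * td_error
    return q[s][a]


# Two identical toy tables (hypothetical values), updated with different step sizes.
q_small = {0: {0: 0.0, 1: 0.0}, 1: {0: 1.0, 1: 0.0}}
q_large = {0: {0: 0.0, 1: 0.0}, 1: {0: 1.0, 1: 0.0}}

small = sarsa_update(q_small, 0, 1, reward=1.0, s_next=1, a_next=0, alpha=0.1)
large = sarsa_update(q_large, 0, 1, reward=1.0, s_next=1, a_next=0, alpha=0.5)
```

With the same transition, the larger step size moves the estimate further toward the target in a single update, which is one way step size can affect how quickly (and how smoothly) such a learner converges.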

“…̃, ( ) = 1 when = arg max ∈ ̅ , ( ), otherwise ̃, ( ) = 0
36: According to (28), update channel selection probability as ̅ , ( ) ← ̅ , ( + 1) as ( + 1) = ( ( + 1), ( + 1), ) where ( + 1) and ( + 1) are calculated by (24) and (25)
42: Calculate the immediate reward ℛ( ( ), ( )) of UAV by (26)
43: Choose the action ( + 1) = { ( + 1), ( + 1)} by (38) and obtain corresponding ( ( + 1), (…”

mentioning

confidence: 99%


Joint Optimization Framework For Maximization of Instantaneous Transmission Rate In Signal To Interference Noise Ratio Constrained UAVs-Supported Self-Organized Device-To-Device Network

Mondal, Hossain 2021, Preprint


Due to their high maneuverability, flexible deployment, and line of sight (LoS) transmission, unmanned aerial vehicles (UAVs) could be an alternative option for reliable device-to-device (D2D) communication when a direct link is not available between source and destination devices due to obstacles in the signal propagation path. Therefore, in this paper, we have proposed a UAVs-supported self-organized device-to-device (USSD2D) network where multiple UAVs are employed as aerial relays. We have developed a novel optimization framework that maximizes the total instantaneous transmission rate of the network by jointly optimizing the deployed location of UAVs, device association, and UAVs’ channel selection while ensuring that every device should achieve a given signal to interference noise ratio (SINR) constraint. As this joint optimization problem is nonconvex and combinatorial, we adopt reinforcement learning (RL) based solution methodology that effectively decouples it into three individual optimization problems. The formulated problem is transformed into a Markov decision process (MDP) where UAVs learn the system parameters according to the current state and corresponding action aiming to maximize the generated reward under the current policy. Finally, we conceive SARSA, a low complexity iterative algorithm for updating the current policy in the case of randomly deployed device pairs which achieves a good computational complexity-optimality tradeoff. Numerical results validate the analysis and provide various insights on the optimal deployment of UAVs. The proposed methodology improves the total instantaneous transmission rate of the network by 75.37%, 52.08%, and 14.77% respectively as compared with RS-FORD, ES-FIRD, and AOIV schemes.


“…The model could be Markov decision process (MDP) or regression model based on statistic data [20]. In practical situation, the future channel state is unknown, and the learning theoretic approach is suitable for unpredictable case [21]. In this case, the transmitter learns the optimal energy allocation policies by performing actions and observing their rewards.…”

Section: Related Work (mentioning)

confidence: 99%
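The learning-theoretic idea in the quote above, where a transmitter learns an energy allocation policy by acting and observing rewards, can be sketched with tabular Q-learning on a toy model. Everything in this sketch is an assumption made for illustration and is not taken from the cited or citing papers: the unit-energy battery, the Bernoulli harvesting with probability 0.5, the `log2(1 + e)` throughput reward, and the name `train_q_table`.

```python
import math
import random


def train_q_table(battery_cap=4, episodes=2000, steps=50,
                  alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy energy-harvesting transmitter (assumed model)."""
    rng = random.Random(seed)
    # q[b][e]: estimated value of spending e energy units at battery level b (e <= b).
    q = [[0.0] * (b + 1) for b in range(battery_cap + 1)]
    for _ in range(episodes):
        b = rng.randint(0, battery_cap)
        for _ in range(steps):
            # Epsilon-greedy exploration over the feasible actions 0..b.
            if rng.random() < eps:
                e = rng.randint(0, b)
            else:
                e = max(range(b + 1), key=lambda x: q[b][x])
            reward = math.log2(1.0 + e)               # toy throughput for spending e units
            harvest = 1 if rng.random() < 0.5 else 0  # toy Bernoulli energy arrival
            b_next = min(battery_cap, b - e + harvest)
            # Off-policy Q-learning target: greedy value of the next state.
            q[b][e] += alpha * (reward + gamma * max(q[b_next]) - q[b][e])
            b = b_next
    return q


q_table = train_q_table()
# Greedy policy read off the learned table: energy to spend at each battery level.
policy = [max(range(b + 1), key=lambda e: q_table[b][e]) for b in range(len(q_table))]
```

The learner never uses the harvesting probability or the reward function directly; it only samples them, which is the sense in which such an approach suits cases where the arrival and channel statistics are unknown.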

Online Power Control and Optimization for Energy Harvesting Communication System Based on State of Charge

Guo, Zhang 2021, Wireless Pers Commun


In this paper, the online power control problem for energy harvesting wireless communication system with a finite storage capacity battery is addressed, where the channel state and energy harvesting rate are both unknown. A low complexity algorithm based online convex optimization is proposed to guarantee energy availability of energy harvesting node and maximize average long-term throughput. The proposed algorithm restricts maximum transmission power with the information of state of charge, and allocates transmission power based on historical information. In addition, energy availability constraint is given by rigorous theoretical analysis to guarantee the optimization of average long-term throughput. Simulations have been conducted to demonstrate the effectiveness of the algorithm without considering probability distribution of energy arrival or channel coefficients. The proposed algorithm outperforms counterparts in different energy harvested rates.
