DDPG-Based Throughput Optimization with AoI Constraint in Ambient Backscatter-Assisted Overlay CRN

Jia, Xueli; Zheng, Kechen; Chi, Kaikai; Liu, Xiaoying

doi:10.3390/s22093262

Cited by 3 publications

(1 citation statement)

References 40 publications


“…For a WPCRN, the average AoI minimization problem was formulated as a partially observable Markov decision process and solved by dynamic programming in [44]. In [45], the network throughput maximization problem with an AoI constraint was studied for a WPCRN with backscatter communication, and the DDPG algorithm was proposed. To the best of our knowledge, few research works have investigated AoI-oriented resource allocation for NOMA-based WPCRNs.…”

Section: Introduction (mentioning)

confidence: 99%

AoI-Oriented Resource Allocation for NOMA-Based Wireless Powered Cognitive Radio Networks Based on Multi-Agent Deep Reinforcement Learning

He, Peng, Liu et al. 2024

IEEE Access


In this paper, we study a wireless powered cognitive internet of things (IoT) network, where cognitive radio (CR) and non-orthogonal multiple access (NOMA) technologies are exploited to improve spectral efficiency, and radio frequency based energy harvesting (RF-EH) technology is integrated to achieve a sustainable IoT network. To ensure the freshness of information delivery, we investigate the age of information (AoI) as a performance metric, and formulate a long-term average AoI minimization problem under an energy sustainability constraint, in which the working mode and transmit power of the secondary devices (SDs) are jointly optimized. Then, we reformulate it as a decentralized Markov decision process (Dec-MDP) with a continuous action space. Accordingly, a deep reinforcement learning (DRL) framework is exploited, and a multi-agent twin delayed deep deterministic policy gradient algorithm with a dual action selection mechanism (MATD3-DAS) is proposed, which adopts the centralized training and decentralized execution (CTDE) framework and exploits both actor and critic networks to select actions for improving exploration ability. Simulation results show that the proposed algorithm can significantly reduce the long-term average AoI, where the decrements approach 9.58% and 52.34% compared with the MATD3 algorithm and the TD3-DAS algorithm with centralized training and centralized execution (CTCE), respectively. INDEX TERMS: Cognitive radio, non-orthogonal multiple access, radio frequency based energy harvesting, age of information, multi-agent deep reinforcement learning.
