Link-Level Throughput Maximization Using Deep Reinforcement Learning

Jamshidiha, Saeed; Pourahmadi, Vahid; Mohammadi, Abbas; Bennis, Mehdi

doi:10.1109/lnet.2020.3000334

Cited by 3 publications

(4 citation statements)

References 9 publications

“…Since the probability density function of the Rayleigh distribution is known, probabilities corresponding to the defined SNR, hence CQI, intervals can be calculated. In conclusion, using the packet arrival probabilities and state transitions expressed in (18) and (20), and CQI probabilities, we can obtain P^{n_l}_{ss′} for all states and all actions. We remark that the formulated MDP has a countable-state space considering both ∆ q (l) ∈ {0, 1, .…”

Section: Adaptive Blocklength Selection For Minimizing Age Violation ... (mentioning)

confidence: 95%
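The interval-probability step in the statement above can be illustrated with a minimal sketch. It relies on the standard fact that a Rayleigh-distributed envelope gives an exponentially distributed instantaneous SNR, so the probability of an SNR interval is a difference of two exponentials. The function name, the mean SNR, and the thresholds are hypothetical values chosen for illustration; they are not taken from the cited paper.

```python
import math


def cqi_probabilities(mean_snr, thresholds):
    """Probability of each SNR interval under Rayleigh fading.

    Assumption: instantaneous SNR is exponential with mean `mean_snr`
    (linear scale). `thresholds` are increasing, hypothetical CQI
    boundaries t_1 < ... < t_K, also in linear scale.
    Returns K + 1 probabilities for [0, t_1), [t_1, t_2), ..., [t_K, inf).
    """
    edges = [0.0] + list(thresholds) + [math.inf]
    # Survival function of an exponential SNR: P(SNR >= x) = exp(-x / mean_snr)
    survival = [math.exp(-edge / mean_snr) for edge in edges]
    return [survival[i] - survival[i + 1] for i in range(len(edges) - 1)]


# Illustrative numbers only: mean SNR of 10 (linear) and three boundaries.
probs = cqi_probabilities(mean_snr=10.0, thresholds=[1.0, 4.0, 16.0])
print(probs)
```

The returned list sums to one, so it can serve directly as the CQI distribution when assembling transition probabilities for an MDP of this kind.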

“…Some works in the literature also use RL techniques for AMC to optimize traditional performance metrics such as throughput [17], [18] and spectral efficiency [19]. However, none of them consider dynamic MCS selection in AoI-aware systems.…”

Section: Introduction (mentioning)

confidence: 99%

“…[19] aims to maximize spectral efficiency and maintain a low block error rate (BLER) while [17] optimizes the link throughput in orthogonal frequency-division multiplexing (OFDM) wireless systems. [18] also maximizes the link-level throughput with MCS selection and power allocation by Deep Deterministic Policy Gradient (DDPG) agents in a distributed manner. MCS selection in age-aware systems has been considered only in [20], where an AoI-driven scheduler without any learning-based approach or any finite blocklength analysis is proposed to minimize the long-term average AoI.…”

Section: Introduction (mentioning)

confidence: 99%

Reinforcement Learning Based Adaptive Blocklength and MCS for Optimizing Age Violation Probability

Ozkaya, Topbas, Ceran

2023

IEEE Access

As a measure of the freshness of data, Age of Information (AoI) has become an essential performance metric in status update applications with stringent timeliness constraints. This study employs adaptive strategies to minimize the novel, information freshness-based performance metric age violation probability (AVP), the probability of the instantaneous age exceeding a predefined constraint, in short packet communications (SPC). AVP can be considered one of the key performance indicators (KPIs) in 5G Ultra-Reliable Low Latency Communications (URLLC), and it is expected to gain more importance in 6G technologies, especially in extreme URLLC (xURLLC). Two distinct approaches are considered: the first focuses on adaptively selecting the blocklengths with either imperfect or missing channel state information exploiting finite blocklength theory approximations. The second involves dynamically choosing the modulation and coding scheme (MCS) to minimize the AVP under stringent timeliness constraints and nonasymptotic information theory bounds. In the context of adaptive blocklength selection, state-aggregated value iteration, Q-learning algorithms, and finite blocklength theory approximations are leveraged to adjust blocklengths to achieve low age violation probabilities adaptively. The simulation results highlight the effectiveness of these algorithms in minimizing age violation probabilities compared to the fixed blocklengths under varying channel conditions. Additionally, constructing a deep reinforcement learning (DRL) framework, we propose a deep Q-network policy for the dynamic selection of the modulation and coding scheme among the available MCSs defined for URLLC systems. 
Through comprehensive simulations, we demonstrate the superiority of the proposed adaptive methods over traditional benchmark methods.

Index Terms: age of information, reinforcement learning, dynamic programming, finite blocklength, adaptive modulation and coding.

This article has been accepted for publication in IEEE Access.
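The abstract defines age violation probability (AVP) as the probability that the instantaneous age exceeds a predefined constraint. As a rough illustration, and not the authors' evaluation procedure, AVP can be estimated from a simulated age trace as the fraction of time slots in which the age is above the threshold. The function name and the sample trace are hypothetical.

```python
def age_violation_probability(ages, threshold):
    """Empirical AVP: share of time slots whose age exceeds `threshold`.

    Assumption: `ages` holds one instantaneous age sample per time slot.
    """
    if not ages:
        raise ValueError("ages must contain at least one sample")
    violations = sum(1 for age in ages if age > threshold)
    return violations / len(ages)


# Hypothetical trace: the age grows by one per slot and resets on delivery.
trace = [1, 2, 3, 4, 1, 2, 3, 4, 5, 1]
avp = age_violation_probability(trace, threshold=3)
print(avp)
```

In this trace three of the ten slots have an age above 3, so the estimate is 0.3.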

“…Additionally, the application of model-based reinforcement learning technology in 5G NR has been explored in [7], which exhibits reduced communication delay compared to conventional methods. The literature [8] proposes a multi-agent deep reinforcement learning (DRL) framework to enhance throughput through power allocation and modulation coding scheme selection. Furthermore, [9] proposes a transmitter modulation coding scheme selection model, QL-AMC, which is derived from the Q-learning algorithm, resulting in enhanced spectral efficiency of the system.…”

Section: Introduction (mentioning)

confidence: 99%

Joint decision-making of communication waveform and power based on Q-learning

Wenchao, Dou, Chen et al.

2024

Second International Conference on Informatics, Networking, and Computing (ICINC 2023)

This paper devises a decision-making method to address the varying communication requirements across diverse scenarios. The proposed method can objectively measure communication performance, and the weights of the various indicators can be customized to adapt to different communication scenarios. Furthermore, to enhance communication performance, this paper investigates reinforcement learning models suitable for waveform and power decision-making and proposes an explored action deletion algorithm based on the Q-learning method. The simulation results demonstrate that the proposed method outperforms existing methods in terms of convergence speed and algorithm accuracy.