2023
DOI: 10.1109/tvt.2022.3223652

Multiuser Scheduling Algorithm for 5G IoT Systems Based on Reinforcement Learning

Cited by 9 publications (3 citation statements)
References 32 publications
“…According to [51], we can approximate the computational complexity of the Q-learning algorithm per iteration as a function of S, A, and H, where S is the number of states, A is the number of actions, and H is the number of steps per episode. Given the state space and action space defined in our simulation environment, the amount of work per iteration can be approximated in terms of the number of antennas at the BS, the number of user devices in the cell, and the size of the channel coefficients [51, 52]. On the other hand, the computational complexity of the benchmark scheme implemented in [15] is determined by the sizes of the input layer, the first hidden layer, the second hidden layer, and the output layer of each network implemented in [15].…”
Section: Simulation Results and Discussion
confidence: 99%
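For concreteness, here is a back-of-the-envelope sketch of the two operation counts discussed above. The O(S·A·H) form for Q-learning and the layer-size sum for the benchmark network are assumptions reconstructed from the definitions in the text, not the exact expressions of [15] or [51], and all numeric values are illustrative.

```python
def q_learning_ops(S, A, H):
    """Assumed per-iteration work for tabular Q-learning:
    a max over A actions at each of H steps, across S states."""
    return S * A * H

def dnn_ops(n0, n1, n2, n3):
    """Multiply-accumulate count for one forward pass of a fully
    connected network with layer sizes n0 -> n1 -> n2 -> n3."""
    return n0 * n1 + n1 * n2 + n2 * n3

# illustrative values only, not those used in the cited papers
print(q_learning_ops(S=256, A=8, H=10))        # 20480
print(dnn_ops(n0=64, n1=128, n2=128, n3=8))    # 25600
```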
“…Elsayed et al. [39] utilized an energy-efficient Q-learning algorithm based on the UCB strategy to dynamically activate and deactivate SCCs (secondary component carriers), maximizing user throughput with the minimum number of active SCCs and thereby maximizing energy efficiency. Li et al. [40] proposed a user scheduling algorithm based on the UCB strategy in reinforcement learning that continuously updates the estimated action value of each user. This approach avoids “extreme unfairness” during the exploration phase, reduces algorithm complexity, and improves system throughput.…”
Section: Related Work
confidence: 99%
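As a rough illustration of the UCB-based scheduler described in [40], the sketch below applies the standard UCB1 rule to per-user throughput estimates. The reward model, parameter names, and values are hypothetical stand-ins, not details taken from the cited paper.

```python
import math
import random

def ucb_schedule(num_users=8, num_slots=1000, c=2.0):
    """Schedule one user per slot with the UCB1 rule: estimated value
    plus an exploration bonus that shrinks as a user is scheduled more."""
    counts = [0] * num_users      # times each user has been scheduled
    values = [0.0] * num_users    # running mean reward (action value) per user
    # hypothetical per-user mean rates standing in for channel quality
    true_rates = [random.uniform(0.2, 1.0) for _ in range(num_users)]

    for t in range(1, num_slots + 1):
        if t <= num_users:
            user = t - 1          # schedule every user once to initialize
        else:
            user = max(
                range(num_users),
                key=lambda u: values[u] + math.sqrt(c * math.log(t) / counts[u]),
            )
        reward = random.gauss(true_rates[user], 0.1)  # observed throughput
        counts[user] += 1
        values[user] += (reward - values[user]) / counts[user]  # incremental mean
    return counts, values

counts, values = ucb_schedule()
print(counts)   # every user keeps getting scheduled occasionally
print(values)   # estimates track the hypothetical true rates
```

The exploration bonus keeps the scheduler from locking onto one user early, which is how a UCB rule avoids the “extreme unfairness” that a purely greedy policy can exhibit during exploration.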
“…To illustrate our approach and contributions, we focus on a state-of-the-art technology example, a multiuser scheduling scheme [13][14][15]. In conventional parallel multiuser scheduling, a scheduled user's signal is detected by correlating against the combined signal of all scheduled users, and the nonzero cross-correlations between users lead to multiple-user interference (MUI).…”
Section: Introduction
confidence: 99%
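A minimal numpy sketch of the MUI mechanism described above, assuming random (hence non-orthogonal) signature sequences and noiseless reception; all names and parameters are illustrative, not taken from [13]-[15].

```python
import numpy as np

rng = np.random.default_rng(0)
num_users, seq_len = 4, 16

# hypothetical signature sequences, one per scheduled user (unit energy)
signatures = rng.choice([-1.0, 1.0], size=(num_users, seq_len)) / np.sqrt(seq_len)
symbols = rng.choice([-1.0, 1.0], size=num_users)  # one BPSK symbol per user

# received signal: superposition of all scheduled users' transmissions
received = signatures.T @ symbols

# correlation detection for user 0
statistic = signatures[0] @ received
desired = symbols[0] * (signatures[0] @ signatures[0])  # equals symbols[0]
mui = statistic - desired  # residual from nonzero cross-correlations
print(f"statistic={statistic:+.3f}, desired={desired:+.3f}, MUI={mui:+.3f}")
```

Because random sequences are not mutually orthogonal, the detection statistic deviates from the transmitted symbol by the MUI term, which grows with the number of simultaneously scheduled users.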