2021
DOI: 10.23919/jcc.2021.06.002
Q-greedyUCB: A new exploration policy to learn resource-efficient scheduling

Cited by 8 publications (4 citation statements)
References 20 publications

“…Step 2: A new action m is selected by the ICV according to the Q-greedyUCB policy [31] and executed;…”
Section: Real-time Trajectory Prediction Methods For Intelligent Conn... (mentioning)
Confidence: 99%

“…The route with the lower time cost is defined as the better scheme for ICVs. In this section, the Q-greedyUCB algorithm [31] is selected as the action policy in the Q-Learning algorithm. In training the LSTM model, five driving behaviors (straight ahead, left lane change, right lane change, left turn, and right turn) are considered to achieve trajectory prediction.…”
Section: Real-time Trajectory Prediction Methods For Intelligent Conn... (mentioning)
Confidence: 99%

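The excerpt above uses Q-greedyUCB as the action-selection (exploration) policy inside Q-learning. As a rough illustration of that pattern only, the following is a minimal sketch of tabular Q-learning with a UCB-style exploration bonus; the toy environment, reward shape, exploration constant c, and the plain UCB rule are assumptions for illustration, not the exact Q-greedyUCB algorithm of the cited paper.

import math
import random
from collections import defaultdict

N_STATES, N_ACTIONS = 5, 3
Q = defaultdict(float)   # Q[(state, action)] -> action-value estimate
N = defaultdict(int)     # N[(state, action)] -> visit count

def ucb_action(state, t, c=1.0):
    """Pick argmax_a Q(s,a) + c*sqrt(ln t / N(s,a)); try untried actions first."""
    for a in range(N_ACTIONS):
        if N[(state, a)] == 0:
            return a
    return max(range(N_ACTIONS),
               key=lambda a: Q[(state, a)] + c * math.sqrt(math.log(t) / N[(state, a)]))

def toy_env_step(state, action):
    """Hypothetical environment: action 0 is best on average (assumption)."""
    reward = 1.0 if action == 0 else random.uniform(0.0, 0.5)
    return random.randrange(N_STATES), reward

state, alpha, gamma = 0, 0.1, 0.95
for t in range(1, 5001):
    action = ucb_action(state, t)
    next_state, reward = toy_env_step(state, action)
    N[(state, action)] += 1
    best_next = max(Q[(next_state, a)] for a in range(N_ACTIONS))
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

The UCB bonus shrinks as a state-action pair is visited more often, so exploration concentrates on actions whose value estimates are still uncertain.
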
“…On the basis of reinforcement learning, William and Setiawan [29] applied a new method to improve the efficiency of job-shop scheduling; Baer et al. [30] proposed a new approach for online scheduling in flexible manufacturing systems (FMS) based on reinforcement learning (RL); and Liu et al. [31] proposed a novel algorithm to address the workflow scheduling problem. Zhao, Lee and Chen [32] proposed a unique algorithm based on the Markov decision process to find an optimal scheduling policy that minimizes delay for a given energy constraint in a communication system. Deep reinforcement learning has also been widely used in scheduling optimization in recent years.…”
Section: Research On Ground Stations Scheduling Problem (mentioning)
Confidence: 99%

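The cited work [32] frames scheduling as a Markov decision process whose policy minimizes delay under an energy constraint. As a hedged, simplified sketch of that general idea (not the paper's actual model), the snippet below folds the energy constraint into the per-slot cost with a fixed Lagrange multiplier and runs value iteration on a toy single-queue model; the buffer size, energy costs, and multiplier lam are illustrative assumptions.

Q_MAX = 5                          # toy buffer size; states are queue lengths
ACTIONS = [0, 1, 2]                # packets transmitted per slot
ARRIVAL = 1                        # deterministic arrival per slot (assumption)
ENERGY = {0: 0.0, 1: 1.0, 2: 2.5}  # transmit-energy cost per action (assumption)
lam, gamma = 0.5, 0.95             # Lagrange multiplier and discount factor

def step(q, a):
    """Queue length after serving a packets and receiving one arrival."""
    return min(max(q - a, 0) + ARRIVAL, Q_MAX)

def cost(q, a):
    """Per-slot cost: holding delay (queue length) plus lam * transmit energy."""
    return q + lam * ENERGY[a]

V = [0.0] * (Q_MAX + 1)
for _ in range(500):               # value iteration until (approximate) convergence
    V = [min(cost(q, a) + gamma * V[step(q, a)] for a in ACTIONS)
         for q in range(Q_MAX + 1)]

policy = [min(ACTIONS, key=lambda a: cost(q, a) + gamma * V[step(q, a)])
          for q in range(Q_MAX + 1)]
print("delay-minimizing actions by queue length:", policy)

Sweeping lam trades off delay against energy: larger values penalize transmission energy more heavily and yield a more conservative schedule.
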
“…However, these works only consider energy consumption optimization and ignore delay and other QoS requirements. In [15], Zhao et al. proposed a delay minimization algorithm based on UCB but neglected the joint optimization of energy consumption and delay. In [16], Bae et al. proposed a downlink network routing algorithm based on UCB to jointly optimize throughput and delay, but ignored the influence of complex EMI and service priority.…”
Section: Introduction (mentioning)
Confidence: 99%