2021
DOI: 10.48550/arxiv.2103.05847
Preprint

A Two-stage Framework and Reinforcement Learning-based Optimization Algorithms for Complex Scheduling Problems

Yongming He, Guohua Wu, Yingwu Chen, et al.

Abstract: Because scheduling problems are so diverse and complex, there is hardly any general solver that handles them efficiently. In this study, we develop a two-stage framework in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined to deal efficiently with complex scheduling problems. The scheduling problem is solved in two stages: a finite Markov decision process (MDP) and a mixed-integer programming process, respectively. This offers a novel and general…

Cited by 1 publication (2 citation statements)
References 34 publications
“…It can effectively solve difficult sample data acquisition problems in the constellation’s early warning system by continuously updating its decision network via the interaction between the intelligence and the environment. Commonly used deep-reinforcement-learning algorithms currently include the following: deep Q-network (DQN) [17, 18, 19, 20], deep deterministic policy gradient (DDPG) [21, 22, 23], proximal policy optimization (PPO) [24, 25, 26], and soft actor–critic (SAC). These algorithms are widely used in the cooperative positioning of moving targets, agile Earth observation satellite mission scheduling, and the collaborative scheduling of ground satellites.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Adding local search algorithms to reinforcement learning can further enhance the search capability of the algorithm based on the above advantages. He et al. developed a two-stage scheduling algorithm framework that combines Q-learning and traditional mixed-integer programming to achieve multi-satellite task scheduling [18].…”
Section: Introduction
Citation type: mentioning
confidence: 99%