2021
DOI: 10.1016/j.apenergy.2021.117519

Applying reinforcement learning and tree search to the unit commitment problem

Cited by 32 publications (14 citation statements)
References 21 publications
“…On the other hand, if rand_{P3} is less than P_3, an aggressive siege fight is chosen, as depicted in Eqs. (14) and (15) [34], where BestVulture_1(i) and BestVulture_2(i) represent the first- and second-best vultures in the current iteration, P(i) is the current position of the vultures, and R(i), d(t), and F are variables defined previously. The Lévy flight equation is used to enhance the efficiency of the AVOA algorithm and is computed according to Equation (16).…”
Section: Phase 3: Exploration
mentioning
confidence: 99%
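The quoted update is easier to see in code. Below is a minimal sketch, assuming the standard AVOA forms: the A1/A2 terms pull a vulture toward the two current best solutions, and a Mantegna-style Lévy step supplies the heavy-tailed perturbation of Equation (16). The function names, the eps guard, and the exact combination of terms are illustrative assumptions, since the citing paper's Eqs. (14)-(15) are not reproduced here.

```python
import math
import numpy as np

def levy_step(dim, beta=1.5):
    """Heavy-tailed Levy step via Mantegna's algorithm, the usual choice
    for the Levy-flight term (the quoted Equation (16))."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = np.random.normal(0.0, sigma, dim)
    v = np.random.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def siege_fight_update(pos, best1, best2, F, eps=1e-12):
    """Aggressive siege-fight move (assumed form): pull toward the two best
    vultures via the A1/A2 terms, then perturb by a Levy step scaled by
    the distance d(t) to the best vulture and the satiation factor F."""
    a1 = best1 - (best1 * pos) / (best1 - pos ** 2 + eps) * F
    a2 = best2 - (best2 * pos) / (best2 - pos ** 2 + eps) * F
    d = best1 - pos                       # d(t) in the quoted notation
    return (a1 + a2) / 2.0 - np.abs(d) * F * levy_step(pos.size)

# usage on a 5-dimensional search space
pos = np.random.uniform(-1.0, 1.0, 5)
new_pos = siege_fight_update(pos, best1=np.zeros(5), best2=np.full(5, 0.1), F=0.3)
```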
“…The algorithm can self-learn, and it was tested on the IEEE 118-bus system and the State Grid Hunan Electric Power Company. The authors in [15] used reinforcement learning with tree search to solve unit commitment under RES and load uncertainties. The effectiveness of this algorithm was tested on a 10-unit system.…”
Section: Introduction
mentioning
confidence: 99%
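As a rough illustration of the cited approach, the sketch below combines lookahead tree search over commitment decisions with a cheap rollout estimate standing in for a learned value function. The 3-unit data, merit-order dispatch, and shortage penalty are illustrative assumptions, not the 10-unit benchmark from the paper.

```python
import itertools
import numpy as np

P_MAX = np.array([200.0, 150.0, 100.0])    # unit capacities (MW), assumed
COST = np.array([20.0, 25.0, 35.0])        # marginal costs ($/MWh), assumed
NO_LOAD = np.array([500.0, 300.0, 100.0])  # fixed costs when on ($/h), assumed

def dispatch_cost(on, demand, penalty=1e4):
    """Cheapest merit-order dispatch of committed units; unserved load is
    penalized so infeasible commitments score poorly."""
    cap = P_MAX * on
    remaining, cost = demand, float(np.sum(NO_LOAD * on))
    for i in np.argsort(COST):
        if remaining <= 0.0:
            break
        p = min(remaining, cap[i])
        cost += COST[i] * p
        remaining -= p
    return cost + penalty * remaining

def rollout_value(on, demands):
    """Cheap heuristic rollout: keep the current commitment fixed.
    An RL-trained value estimate would replace this stand-in."""
    return sum(dispatch_cost(on, d) for d in demands)

def tree_search(demands, depth=2):
    """Expand commitment decisions `depth` periods ahead, then bootstrap
    the rest of the horizon with the rollout value."""
    best_cost, best_first = float("inf"), None
    actions = [np.array(a) for a in itertools.product([0.0, 1.0], repeat=len(P_MAX))]

    def expand(t, on_prev, cost_so_far, first):
        nonlocal best_cost, best_first
        if t == depth or t == len(demands):
            total = cost_so_far + rollout_value(on_prev, demands[t:])
            if total < best_cost:
                best_cost, best_first = total, first
            return
        for on in actions:
            c = cost_so_far + dispatch_cost(on, demands[t])
            expand(t + 1, on, c, first if first is not None else on)

    expand(0, np.ones(len(P_MAX)), 0.0, None)
    return best_first, best_cost

schedule, cost = tree_search([300.0, 420.0, 250.0])
print("first-period commitment:", schedule, "estimated total cost:", round(cost, 1))
```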
“…In recent years, many researchers have adopted data-driven approaches like reinforcement learning (RL) to overcome the limitations of traditional approaches in PSD [7]. In [8], a distributed Q-learning algorithm is adopted to improve robustness through local information communication; in [9], a policy gradient algorithm achieves fast and efficient search of the action space through a bootstrapped tree search approach; in [10], multi-scene parallel optimal scheduling is implemented; in [11], better convergence and economy are achieved through multi-agent reinforcement learning.…”
Section: Introduction
mentioning
confidence: 99%
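For readers unfamiliar with the RL machinery these works share, here is a minimal single-agent sketch of the tabular Q-learning bootstrap update; the distributed variant in [8] and the policy-gradient/tree-search method in [9] build on the same temporal-difference idea. The toy transition function, state discretization, and reward are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 10, 8          # e.g. 10 net-load bins, 2^3 unit patterns
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1   # step size, discount, exploration rate

def step(s, a):
    """Toy transition: reward is the negative of a placeholder mismatch
    cost, and the next state is a random load bin. A real environment
    would evaluate dispatch cost and follow the load profile."""
    cost = abs(s - a)
    return -float(cost), int(rng.integers(n_states))

for episode in range(500):
    s = int(rng.integers(n_states))
    for t in range(24):              # one day of hourly decisions
        # epsilon-greedy action selection
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
        r, s_next = step(s, a)
        # standard Q-learning temporal-difference update
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
```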
“…This makes UC rely heavily on accurate forecasts, while day-ahead forecasts of renewable generation are presently unreliable. Alternative methods, including reinforcement learning [7][8][9], have so far been proposed only for offline control and have not considered complex network constraints such as transmission line capacity, which makes them infeasible for actual scheduling. On the other hand, ED is usually modeled as a convex optimization problem and can be solved by interior-point and dual methods [10][11][12], but these methods still encounter convergence and computational-speed problems when faced with AC power flow models, under which ED becomes a non-convex problem.…”
mentioning
confidence: 99%
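The convex ED formulation mentioned above can be stated compactly: minimize quadratic fuel costs subject to a power-balance equality and generator limits. The sketch below uses SciPy's general-purpose SLSQP solver for brevity, whereas the interior-point and dual methods cited in [10][11][12] are the production choices; all coefficients are illustrative, and the AC power flow constraints that make the problem non-convex are omitted.

```python
import numpy as np
from scipy.optimize import minimize

a = np.array([0.004, 0.006, 0.009])   # quadratic cost coefficients, assumed
b = np.array([20.0, 25.0, 35.0])      # linear cost coefficients ($/MWh), assumed
p_min = np.array([50.0, 40.0, 30.0])  # generator lower limits (MW)
p_max = np.array([300.0, 250.0, 150.0])  # generator upper limits (MW)
demand = 450.0                        # total load to serve (MW)

def cost(p):
    """Total quadratic fuel cost across the three units."""
    return float(np.sum(a * p ** 2 + b * p))

res = minimize(
    cost,
    x0=np.full(3, demand / 3),        # feasible-ish equal-split start
    bounds=list(zip(p_min, p_max)),
    constraints=[{"type": "eq", "fun": lambda p: np.sum(p) - demand}],
    method="SLSQP",                   # general NLP solver, used here for brevity
)
print("dispatch (MW):", np.round(res.x, 1), "cost ($/h):", round(res.fun, 2))
```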