2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
DOI: 10.1109/adprl.2014.7010626

A comparison of approximate dynamic programming techniques on benchmark energy storage problems: Does anything work?

Abstract: As more renewable, yet volatile, forms of energy like solar and wind are being incorporated into the grid, the problem of finding optimal control policies for energy storage is becoming increasingly important. These sequential decision problems are often modeled as stochastic dynamic programs, but when the state space becomes large, traditional (exact) techniques such as backward induction, policy iteration, or value iteration quickly become computationally intractable. Approximate dynamic programming (ADP) th…
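To make the scale issue concrete, below is a minimal, self-contained sketch of exact backward induction on a toy storage problem. Every model detail (horizon, discretization, price process) is an invented illustration, not one of the paper's benchmark instances; the point is that each extra state dimension multiplies the nested loops, which is why the exact methods named in the abstract become intractable as the state space grows.

```python
# Minimal sketch of exact backward induction on a toy energy-storage problem.
# All model details (price levels, storage capacity, horizon) are invented for
# illustration; they are not the benchmark instances used in the paper.
import numpy as np

T = 24                        # decision epochs (hours)
levels = np.arange(0, 11)     # storage state: 0..10 MWh, discretized
prices = np.array([20.0, 40.0, 60.0])    # exogenous price states ($/MWh)
P = np.full((3, 3), 1.0 / 3.0)           # price transition matrix (uniform, illustrative)

V = np.zeros((T + 1, len(levels), len(prices)))   # terminal value = 0

for t in reversed(range(T)):
    for si, s in enumerate(levels):
        for pi, p in enumerate(prices):
            best = -np.inf
            # action a: energy bought (a > 0) or sold (a < 0), limited to +/- 2 MWh
            for a in range(-min(s, 2), min(10 - s, 2) + 1):
                reward = -p * a                    # pay to charge, earn to discharge
                s_next = si + a
                cont = P[pi] @ V[t + 1, s_next]    # expected value over next price
                best = max(best, reward + cont)
            V[t, si, pi] = best

print("Value of starting empty at the low price:", V[0, 0, 0])
```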

Cited by 42 publications (39 citation statements)
References 24 publications
“…HAPI itself can be classified in this framework as a critic-only algorithm, as it solely uses an approximation of the value function but not a separate approximation of the policy. Other modifications of HAPI could include, for example, the use of different basis functions, such as higher-order polynomials (see, e.g., Löhndorf and Minner 2010) and Gaussian radial basis functions (see, e.g., Jiang et al. 2014). Likewise, researchers could investigate adapting this approach to a direct policy search (see, e.g., Scott and Powell 2012; Nascimento and Powell 2013) or to an approximate value iteration (AVI; see, e.g., Jiang and Powell 2015).…”
Section: Discussion (mentioning)
Confidence: 99%
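As a side note on the Gaussian radial basis functions mentioned in the statement above, the snippet below sketches a linear-in-features value function approximation fitted by least squares. The centers, bandwidth, and synthetic "observations" are assumptions made purely for illustration and do not reproduce any cited method.

```python
# Hedged sketch: approximating a value function over the storage level with
# Gaussian radial basis functions, fitted by least squares. The "observed"
# values are synthetic; centers and bandwidth are arbitrary illustrative choices.
import numpy as np

def rbf_features(s, centers, width):
    """phi_j(s) = exp(-(s - c_j)^2 / (2 * width^2)) for each center c_j."""
    return np.exp(-((s[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

centers = np.linspace(0.0, 10.0, 6)      # RBF centers along the storage axis
width = 1.5                              # common bandwidth (assumption)

# Synthetic (state, value) observations standing in for ADP sampling sweeps.
states = np.random.uniform(0.0, 10.0, size=200)
values = 50.0 * np.sqrt(states) + np.random.normal(0.0, 2.0, size=200)

Phi = rbf_features(states, centers, width)               # design matrix
weights, *_ = np.linalg.lstsq(Phi, values, rcond=None)   # linear-in-features fit

def v_hat(s):
    """Approximate value of a scalar storage level s."""
    return rbf_features(np.array([s]), centers, width) @ weights

print("Approximate value at half-full storage:", float(v_hat(5.0)))
```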
“…Surprisingly, in numerical experiments with real-world prices, they show that a very inefficient battery, which more or less only burns energy, has the highest value, because it exploits negative prices. Jiang et al. (2014) directly compare the performance of different ADP approaches for the optimal control of an energy storage device. Jiang and Powell (2015) focus on one of these approaches and exploit the monotonicity of the value function.…”
Section: Literature Review (mentioning)
Confidence: 99%
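The monotonicity idea attributed to Jiang and Powell (2015) in the statement above can be illustrated with a simplified update: after smoothing a new observation into one storage level, neighboring estimates that violate monotonicity are pulled into line. This is only a hedged sketch of the general idea, not the exact projection operator from that paper; the stepsize and state grid are arbitrary.

```python
# Hedged sketch of exploiting value-function monotonicity: after a stochastic
# update at one storage level, project violating neighbors so the estimate
# stays nondecreasing in the resource level. Simplified illustration only.
import numpy as np

def monotone_update(v, idx, observation, stepsize=0.1):
    """Smooth the observation into v[idx], then restore monotonicity around idx."""
    v = v.copy()
    v[idx] = (1 - stepsize) * v[idx] + stepsize * observation
    # Enforce v[0] <= v[1] <= ... by clipping violators against the updated point.
    v[:idx] = np.minimum(v[:idx], v[idx])
    v[idx + 1:] = np.maximum(v[idx + 1:], v[idx])
    return v

v_est = np.zeros(11)                      # value estimate over storage levels 0..10
v_est = monotone_update(v_est, idx=5, observation=120.0)
print(v_est)
```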
“…There has recently been a lot of interest in the development of efficient methods for stochastic optimal control problems, such as stochastic gradient methods [23], the alternating direction method of multipliers (ADMM) [24], and various decomposition methods which can lead to parallelizable algorithms [25,26] (the most popular being stochastic dual approximate dynamic programming [27], the progressive hedging approach [28], and dynamic programming [29]). Parallelizable interior-point algorithms have been proposed for two-stage stochastic optimal control problems [30,31,32,33], as well as an ad hoc interior-point solver for multi-stage problems [34].…”
Section: State of the Art (mentioning)
Confidence: 99%
“…This is a parameterized cost function approximation where the reserve parameters are tuned in an online fashion (that is, in the real world), although they could be tuned in a simulator (offline). • Value function approximations: often referred to as dynamic programming (or approximate dynamic programming), value functions are particularly useful in the control of storage problems (see [34]-[36]). Value functions are widely approximated in the controls community using neural networks, where they are often referred to as "critic nets."…”
Section: Illustrating the Four Classes in Energy Applications (mentioning)
Confidence: 99%
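For the "critic net" terminology used in the statement above, a minimal sketch of a neural value function approximator follows. The architecture, optimizer settings, and synthetic training targets are illustrative assumptions, not a recipe from any of the cited works.

```python
# Hedged sketch of a small "critic net": a feedforward network mapping a
# (storage level, price) state to an approximate value, fitted to synthetic
# targets. Architecture and data are illustrative assumptions only.
import torch
import torch.nn as nn

critic = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(critic.parameters(), lr=1e-2)

# Synthetic (state, value) pairs standing in for targets produced by an ADP loop.
states = torch.rand(256, 2) * torch.tensor([10.0, 60.0])   # storage in [0,10], price in [0,60]
targets = 5.0 * states[:, :1] + 0.5 * states[:, 1:]         # made-up value targets

for _ in range(200):
    loss = nn.functional.mse_loss(critic(states), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("V_hat(half-full, mid price):", critic(torch.tensor([[5.0, 30.0]])).item())
```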