2011
DOI: 10.1002/9781118029176

Approximate Dynamic Programming

Abstract: This is my preface. I am going to explain why I wrote this book and who it is for.

Chapter 1: The challenges of dynamic programming. The optimization of problems over time arises in many settings, ranging from the control of heating systems to managing entire economies. In between are examples including landing aircraft, purchasing new equipment, managing blood inventories, scheduling fleets of vehicles, selling assets, investing money in portfolios, or just playing a game of tic-tac-toe or backgammon. These proble…
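For orientation (standard dynamic programming background, not part of the quoted abstract): problems of this kind are typically formalized through Bellman's equation, which approximate dynamic programming attacks by replacing the exact value function with a statistical approximation:

$$
V(s) \;=\; \max_{a \in \mathcal{A}} \Big( C(s, a) \;+\; \gamma \sum_{s'} \mathbb{P}(s' \mid s, a)\, V(s') \Big)
$$

Here $C(s,a)$ is the one-period contribution of taking action $a$ in state $s$, $\gamma$ is a discount factor, and $\mathbb{P}(s' \mid s, a)$ is the transition probability to the next state $s'$.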

Cited by 1,009 publications (205 citation statements)
References 121 publications
“…This type of scenario exploration would tell managers the permissible level of fishing given a target biomass and recent ocean conditions. This approach is analogous to the greedy heuristic strategies that are used in high-dimensional approximate optimization problems (26). Of course, considerable work remains to be done to develop and evaluate management plans based on these methods.…”
Section: Discussion
confidence: 99%
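As a point of reference (not from the cited paper), a greedy heuristic of the kind this statement alludes to simply maximizes the immediate estimated payoff at each decision point; the function and argument names below are hypothetical:

```python
# Minimal greedy-policy sketch (illustrative names): at every step, pick
# the action with the best one-step estimated reward, ignoring future value.
def greedy_action(state, actions, reward_estimate):
    """Return the action maximizing the immediate estimated reward."""
    return max(actions, key=lambda a: reward_estimate(state, a))

# Hypothetical usage: candidate harvest levels scored by a reward model
# that penalizes exceeding 10% of the current biomass.
harvest = greedy_action(
    state={"biomass": 1_000.0},
    actions=[0.0, 50.0, 100.0],
    reward_estimate=lambda s, a: a if a <= 0.1 * s["biomass"] else -a,
)
```

The appeal in high-dimensional problems is that each step only requires ranking the currently feasible actions, never enumerating full trajectories.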
“…This algorithm is particularly interesting when the number of states is huge. In this case, classical algorithms like Minimax and Alphabeta [9], for two-player games, and Dynamic Programming [13], for one-player games, are too time-consuming or not efficient. MCTS combines an exploration of the tree based on a compromise between exploration and exploitation, and an evaluation based on Monte-Carlo simulations.…”
Section: Introduction
confidence: 99%
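For context (a generic illustration, not code from the cited paper), the exploration-exploitation compromise in MCTS is commonly implemented with the UCB1 selection rule: each child's score adds a visit-count bonus, which shrinks as the child is sampled more, to its mean Monte-Carlo value:

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    visits: int = 0         # number of simulations through this node
    value_sum: float = 0.0  # sum of Monte-Carlo rewards backed up here

def ucb1(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """UCB1 score: mean value (exploitation) plus a bonus that decays
    with the child's visit count (exploration)."""
    if child.visits == 0:
        return float("inf")  # unvisited children are expanded first
    exploit = child.value_sum / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

# Selection step: descend to the child with the highest UCB1 score.
children = [Node(visits=10, value_sum=6.0), Node(visits=2, value_sum=1.5)]
best = max(children, key=lambda n: ucb1(n, parent_visits=12))
```

Because the tree grows only where simulations are directed, this avoids the exhaustive sweep over states that makes Minimax, Alpha-Beta, and exact dynamic programming impractical when the state space is huge.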
“…It is weaker because it does not have to assign the precise quantity "discounted future sum of rewards"; any number will do as long as it helps to grow the tree in roughly the right direction. It is for this reason that we believe it could be advantageous to try to learn a good scoring function instead of trying to directly learn/approximate the optimal value function: the former can be a rather simple function (as evidenced in the good results we get in Section 5 where we use for all domains exactly the same simple weighted sum of features), whereas the latter would require a far more expressive parametrization (and it is well-known that value function approximation scales badly when the dimensionality of the state space grows [41]). Figure 2 presents a simple algorithm based on a sorted list to implement policies as parameterized look-ahead trees.…”
Section: Connection With the Optimal Value Function
confidence: 99%
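A minimal sketch of what such a "simple weighted sum of features" scoring function could look like (the features and weights below are hypothetical, not taken from the cited work):

```python
# Hypothetical linear scoring function for ordering nodes in a look-ahead
# tree: any heuristic that ranks nodes sensibly will do, so a weighted
# feature sum suffices; no value-function approximation is required.
def score(features, weights):
    """Weighted sum of node features."""
    return sum(w * f for w, f in zip(weights, features))

# Usage: always expand the leaf with the highest heuristic score.
leaves = {"a": (0.2, 0.7), "b": (0.9, 0.1)}  # node -> feature vector
weights = (1.0, 0.5)                         # learned or hand-tuned
next_leaf = max(leaves, key=lambda k: score(leaves[k], weights))
```

The design point the statement makes is that only the ordering induced by the score matters, so a low-dimensional parametrization can work where approximating the optimal value function itself would demand a far richer model.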