2018
DOI: 10.1137/16m1073042
Optimal Learning for Stochastic Optimization with Nonlinear Parametric Belief Models

Abstract: We consider the problem of estimating the expected value of information (the knowledge gradient) for Bayesian learning problems where the belief model is nonlinear in the parameters. Our goal is to maximize some metric, while simultaneously learning the unknown parameters of the nonlinear belief model, by guiding a sequential experimentation process which is expensive. We overcome the problem of computing the expected value of an experiment, which is computationally intractable, by using a sampled approximation…
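The abstract refers to replacing the intractable expectation in the knowledge gradient with a sampled approximation. As a rough illustration only, not the paper's code, the sketch below assumes the belief over the unknown parameter θ is carried by a finite set of sampled parameter vectors with probabilities that are updated by Bayes' rule, and estimates the value of a candidate experiment by Monte Carlo; the function mu, the candidate set X, and all other names are hypothetical.

```python
import numpy as np

def knowledge_gradient_sampled(mu, thetas, probs, X, sigma, x, n_mc=200, rng=None):
    """Monte Carlo estimate of the value of information from measuring x.

    mu     : callable mu(x, theta) -> float, the nonlinear belief model
    thetas : sampled parameter vectors (a discretized prior on theta)
    probs  : current probabilities over thetas
    X      : iterable of candidate decisions
    sigma  : observation-noise standard deviation
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs, dtype=float)

    # Current posterior-mean surface and its best value.
    mu_bar = np.array([np.dot(probs, [mu(xp, th) for th in thetas]) for xp in X])
    best_now = mu_bar.max()

    # Predictive mean of an observation at x under each sampled theta.
    mu_x = np.array([mu(x, th) for th in thetas])

    gains = np.empty(n_mc)
    for i in range(n_mc):
        # Simulate an outcome y from the predictive distribution at x.
        k = rng.choice(len(thetas), p=probs)
        y = mu_x[k] + sigma * rng.normal()
        # Bayes update of the sampled belief given the simulated observation.
        w = probs * np.exp(-0.5 * ((y - mu_x) / sigma) ** 2)
        w /= w.sum()
        # Best posterior-mean value after the (simulated) experiment.
        mu_bar_new = np.array([np.dot(w, [mu(xp, th) for th in thetas]) for xp in X])
        gains[i] = mu_bar_new.max()

    return gains.mean() - best_now
```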

Cited by 5 publications (2 citation statements); references 32 publications.
“…This can be a reasonable approximation in one or two dimensions, but tends to become quite poor in three or more dimensions. He and Powell (2018) relaxes the restriction of a fixed set of θ's by adaptively resampling a large pool and choosing new values based on mean squared error. While this logic was shown to be useful for higher dimensional problems (e.g.…”
Section: Bidding Policy: Knowledge Gradient With Bootstrap Aggregation (mentioning)
confidence: 99%
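The resampling logic attributed above to He and Powell (2018), adaptively choosing new θ values from a large pool based on mean squared error, could look roughly like the sketch below; the pool, the history format, and the function name are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def resample_theta_pool(mu, pool, history, k_keep, rng=None):
    """Keep the k_keep candidates from a large pool of theta vectors whose
    predictions best fit the (x, y) observations collected so far, scored
    by mean squared error."""
    rng = np.random.default_rng() if rng is None else rng
    if not history:
        # No data yet: fall back to a uniform random subset of the pool.
        idx = rng.choice(len(pool), size=k_keep, replace=False)
        return [pool[i] for i in idx]
    # Mean squared prediction error of each candidate on the data so far.
    mse = np.array([np.mean([(y - mu(x, th)) ** 2 for x, y in history])
                    for th in pool])
    keep = np.argsort(mse)[:k_keep]
    return [pool[i] for i in keep]
```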
“…In Section 4.2, we provide the first asymptotic convergence theorem for the contextual knowledge gradient policy. In Section 4.3, we design a resampling procedure specifically for convex loss functions using bootstrap aggregation to find the most promising parameter values, without the need to maintain or enumerate a large pool as used in He and Powell (2018).…”
Section: Bidding Policy: Knowledge Gradient With Bootstrap Aggregation (mentioning)
confidence: 99%
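The bootstrap-aggregation resampling described in this citing work is only summarized here, but a generic sketch of the underlying idea, refitting a convex loss on bootstrap resamples of the data to obtain promising parameter values, might look like the following; the loss signature and the use of scipy.optimize are assumptions, not the cited procedure itself.

```python
import numpy as np
from scipy.optimize import minimize

def bootstrap_theta_candidates(loss, history, theta0, n_boot=20, rng=None):
    """Refit the parametric model on bootstrap resamples of the observations
    and return the fitted parameter vectors as candidate theta values.

    loss    : callable loss(theta, data) -> float, assumed convex in theta
    history : list of (x, y) observations
    theta0  : starting point for each fit
    """
    rng = np.random.default_rng() if rng is None else rng
    candidates = []
    for _ in range(n_boot):
        # Draw a bootstrap sample of the data (with replacement).
        idx = rng.integers(0, len(history), size=len(history))
        data = [history[i] for i in idx]
        fit = minimize(loss, theta0, args=(data,))
        candidates.append(fit.x)
    return candidates
```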