1999
DOI: 10.1287/mnsc.45.11.1570
Simulation-Based Optimization with Stochastic Approximation Using Common Random Numbers

Abstract: The method of Common Random Numbers is a technique used to reduce the variance of difference estimates in simulation optimization problems. These differences are commonly used to estimate gradients of objective functions as part of the process of determining optimal values for parameters of a simulated system. Asymptotic results exist which show that using the Common Random Numbers method in the iterative Finite Difference Stochastic Approximation optimization algorithm (FDSA) can increase the optimal rate of …
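To make the abstract's variance-reduction claim concrete, here is a minimal Python sketch (an illustration, not the paper's code; the quadratic loss and its noise model are assumptions). Feeding the same random-number stream into both runs of a finite-difference pair makes the simulation noise largely cancel in the difference:

```python
import numpy as np

def loss(theta, rng):
    # Hypothetical noisy simulation output: a quadratic objective plus
    # noise whose scale depends on theta (so CRN cancels it only partly).
    return (theta - 2.0) ** 2 + rng.normal() * theta

def diff_estimate(theta, c, seed_plus, seed_minus):
    # Central finite-difference gradient estimate from two simulation runs.
    y_plus = loss(theta + c, np.random.default_rng(seed_plus))
    y_minus = loss(theta - c, np.random.default_rng(seed_minus))
    return (y_plus - y_minus) / (2.0 * c)

theta, c, n = 0.5, 0.1, 10_000
seeds = np.random.default_rng(0).integers(0, 2**31, size=(n, 2))

indep = [diff_estimate(theta, c, s0, s1) for s0, s1 in seeds]  # independent streams
crn   = [diff_estimate(theta, c, s0, s0) for s0, _  in seeds]  # common random numbers

print("variance, independent runs:", np.var(indep))
print("variance, CRN:             ", np.var(crn))
```

Both estimators have the same mean, but the CRN version's variance is roughly an order of magnitude smaller in this toy setting, which is the effect the paper exploits inside FDSA.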

Cited by 82 publications (45 citation statements)
References 14 publications
“…On the task of tuning the parameters of the opponent model, RSPSA resulted in a significantly better performance as compared to that obtained by using RFDSA. This confirms some of the previous findings such as those of Spall (1992) and Kleinman, Spall and Neiman (1999), whilst it contradicts some expectations published elsewhere, such as in Kushner and Yin (1997) and Dippon (2003). In the case of policy optimisation, RSPSA was competitive with TD-learning, although the combination of supervised learning followed by TD-learning outperformed RSPSA.…”
Section: Discussion (supporting, confidence: 81%)
“…In fact, if this method is employed, the convergence rate is improved to O(t^{-1/2}). This was shown for FDSA by Glasserman and Yao (1992) and L'Ecuyer and Yin (1998), and later extended to SPSA by Kleinman, Spall and Neiman (1999).…”
Section: Efficiency (mentioning, confidence: 61%)
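As a rough illustration of where CRN enters the FDSA iteration discussed above, the sketch below reuses one seed for the +c_k and -c_k runs of each coordinate; the loss function and gain sequences are hypothetical choices for illustration, not the ones analyzed in the cited works.

```python
import numpy as np

def loss(theta, rng):
    # Hypothetical noisy quadratic loss with minimum at (1, ..., 1).
    return np.sum((theta - 1.0) ** 2) + rng.normal() * np.linalg.norm(theta)

def fdsa_crn(theta0, iters=2000, a=0.5, c=0.5, alpha=1.0, gamma=1.0 / 6.0):
    # FDSA with common random numbers: each coordinate's pair of
    # simulations shares a seed, so their noise cancels in the difference.
    # (With CRN, theory permits c_k to shrink faster than the classical
    # gamma = 1/6 used here; that tuning is not shown.)
    theta = np.asarray(theta0, dtype=float)
    master = np.random.default_rng(0)
    for k in range(1, iters + 1):
        a_k = a / k ** alpha
        c_k = c / k ** gamma
        grad = np.zeros(theta.size)
        for i in range(theta.size):
            e_i = np.zeros(theta.size)
            e_i[i] = 1.0
            seed = int(master.integers(2**31))  # one seed per +/- pair (CRN)
            y_plus = loss(theta + c_k * e_i, np.random.default_rng(seed))
            y_minus = loss(theta - c_k * e_i, np.random.default_rng(seed))
            grad[i] = (y_plus - y_minus) / (2.0 * c_k)
        theta -= a_k * grad
    return theta

print(fdsa_crn(np.zeros(2)))  # ends up near [1. 1.]
```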
“…If single parameters are perturbed, this method is known as the Kiefer-Wolfowitz procedure, and if multiple parameters are perturbed simultaneously, it is known as Simultaneous Perturbation Stochastic Approximation (SPSA); see Sadegh and Spall (1997) and Spall (2003) for in-depth treatment. This approach can be highly efficient in simulation optimization of deterministic systems (Spall, 2003) or when a common history of random numbers (Glynn, 1987; Kleinman, Spall, & Neiman, 1999) is being used (the latter trick is known as the PEGASUS method in reinforcement learning, see Ng and Jordan (2000)), and can get close to a convergence rate of O(I^{-1/2}) (Glynn, 1987). However, when used on a real system, the uncertainties degrade the performance, resulting in convergence rates ranging between O(I^{-1/4}) and O(I^{-2/5}) depending on the chosen reference value (Glynn, 1987).…”
Section: Finite-Difference Methods (mentioning, confidence: 99%)
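To contrast with the per-coordinate finite differences in the FDSA sketch above, here is a minimal SPSA sketch in the same spirit (again an illustration with a hypothetical loss and standard-looking gain constants, not code from the cited works). All coordinates are perturbed at once with random signs, so each iteration costs only two simulation runs, and a flag toggles CRN between the pair:

```python
import numpy as np

def loss(theta, rng):
    # Hypothetical noisy quadratic loss with minimum at (1, ..., 1).
    return np.sum((theta - 1.0) ** 2) + rng.normal() * np.linalg.norm(theta)

def spsa(theta0, iters=5000, a=0.1, c=0.2, alpha=0.602, gamma=0.101, crn=True):
    theta = np.asarray(theta0, dtype=float)
    master = np.random.default_rng(0)
    for k in range(1, iters + 1):
        a_k = a / k ** alpha
        c_k = c / k ** gamma
        # Rademacher (+/-1) simultaneous perturbation of every coordinate.
        delta = master.choice([-1.0, 1.0], size=theta.size)
        seed = int(master.integers(2**31))
        y_plus = loss(theta + c_k * delta, np.random.default_rng(seed))
        y_minus = loss(theta - c_k * delta,
                       np.random.default_rng(seed if crn else seed + 1))
        # Two runs yield the whole gradient estimate: one scalar difference
        # divided elementwise by the perturbation vector.
        grad = (y_plus - y_minus) / (2.0 * c_k * delta)
        theta -= a_k * grad
    return theta

print(spsa(np.zeros(3)))  # ends up near [1. 1. 1.]
```

Relative to FDSA, which needs 2d simulations per step in d dimensions, this design uses two regardless of d, which is why the quoted passage singles it out as highly efficient when combined with common random numbers.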