Approximate linear programs (ALPs) are well-known models based on value function approximations (VFAs) for obtaining heuristic policies and lower bounds on the optimal policy cost of Markov decision processes (MDPs). The ALP VFA is a linear combination of predefined basis functions that are chosen using domain knowledge and updated heuristically if the ALP optimality gap is large. We sidestep the need for such basis function engineering in ALP (an implementation bottleneck) by proposing a sequence of ALPs that embed increasing numbers of random basis functions obtained via inexpensive sampling. We provide a sampling guarantee and show that the VFAs from this sequence of models converge to the exact value function. Nevertheless, the performance of the ALP policy can fluctuate significantly as more basis functions are sampled. To mitigate these fluctuations, we "self-guide" our convergent sequence of ALPs using past VFA information so that a worst-case measure of policy performance improves. We perform numerical experiments on perishable inventory control and generalized joint replenishment applications, which give rise, respectively, to challenging discounted-cost MDPs and average-cost semi-MDPs. We find that self-guided ALPs (i) significantly reduce policy cost fluctuations and improve the optimality gaps relative to an ALP approach that employs basis functions tailored to the former application, and (ii) deliver optimality gaps comparable to those of a known adaptive basis function generation approach targeting the latter application. More broadly, our methodology provides application-agnostic policies and lower bounds that can benchmark approaches exploiting application structure.
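To illustrate the core idea of a VFA built from randomly sampled basis functions, the following is a minimal sketch. The random Fourier form of the basis functions and the sampling distributions (standard normal frequencies, uniform phase shifts) are illustrative assumptions, not the paper's specification, and the weights here are fit by least squares to a toy target rather than chosen by solving an ALP:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_random_basis(num_features, state_dim, rng):
    """Sample random Fourier basis functions phi_i(s) = cos(w_i . s + b_i).
    The sampling distributions here are an illustrative choice."""
    W = rng.standard_normal((num_features, state_dim))
    b = rng.uniform(0.0, 2 * np.pi, size=num_features)
    # Returns a map from a batch of states (n, state_dim) to features (n, num_features).
    return lambda S: np.cos(S @ W.T + b)

# The VFA is a linear combination of the sampled basis functions:
# V_hat(s) = beta . phi(s). An ALP would pick beta via a linear program;
# here we simply fit beta to a stand-in "value function" to show the mechanics.
states = rng.uniform(-1.0, 1.0, size=(200, 2))     # sampled states
target = np.sin(states[:, 0]) + states[:, 1] ** 2  # toy value function

phi = sample_random_basis(50, state_dim=2, rng=rng)
features = phi(states)
beta, *_ = np.linalg.lstsq(features, target, rcond=None)
vfa = features @ beta  # VFA evaluated at the sampled states
```

With enough sampled basis functions, this linear-combination form can approximate a smooth value function closely, which mirrors the convergence phenomenon the abstract describes as more random basis functions are embedded.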