2014
DOI: 10.1007/978-3-319-11662-4_15

On Learning the Optimal Waiting Time

Abstract: Consider the problem of learning how long to wait for a bus before walking, experimenting each day and assuming that the bus arrival times are independent and identically distributed random variables with an unknown distribution. Similar uncertain optimal stopping problems arise when devising power-saving strategies, e.g., learning the optimal disk spin-down time for mobile computers, or speeding up certain types of satisficing search procedures by switching from a potentially fast search method that…

Citations: cited by 20 publications (23 citation statements)
References: 11 publications
“…A very recent line of research tries to characterize the statistical complexity of interactive decision making, with both upper and lower bounds, based on either the decision-estimation coefficient (DEC) and its variants [FKQR21, FGQ+22, FRSS22, CMB22, FGH23], or the generalized information ratio [LG21, Lat22]. Although these result typically lead to the right regret dependence on T for general bandit problems, the dependence on d could be loose in both their upper and lower bounds.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, the use of Cauchy-Schwarz inequality over the whole action set A is agnostic to the distribution of A * and thus illuminates the effect of a prior. As far as we know, all the existing upper bound analysis of information ratio (Tossou et al., 2017; Lattimore & Szepesvári, 2019; Hao et al., 2021; Hao & Lattimore, 2022) are prior-independent.…”
Section: Why Existing Analysis Is Not Sufficient (mentioning)
confidence: 99%
“…There are other works in the RL literature studying RL problems in the linear setting. [53,33,50] consider the linear setting with a generative model in the time-homogenous case. These works use L ∞ estimation instead of UCB estimation to handle the distribution mismatch phenomenon.…”
Section: Estimation and Optimization Error (mentioning)
confidence: 99%