2021
DOI: 10.48550/arxiv.2110.15501
Preprint

Doubly Robust Interval Estimation for Optimal Policy Evaluation in Online Learning

Abstract: Evaluating the performance of an ongoing policy plays a vital role in many areas such as medicine and economics, providing crucial guidance on the early stopping of online experiments and timely feedback from the environment. Policy evaluation in online learning has thus attracted increasing attention; it infers the mean outcome of the optimal policy (i.e., the value) in real time. Yet this problem is particularly challenging due to the dependent data generated in the online environment, the unknown optimal…

Cited by 0 publications
References 30 publications