2019
DOI: 10.1177/2158244019851675
Bootstrap Thompson Sampling and Sequential Decision Problems in the Behavioral Sciences

Abstract: Behavioral scientists are increasingly able to conduct randomized experiments in settings that enable rapidly updating probabilities of assignment to treatments (i.e., arms). Thus, many behavioral science experiments can be usefully formulated as sequential decision problems. This article reviews versions of the multiarmed bandit problem with an emphasis on behavioral science applications. One popular method for such problems is Thompson sampling, which is appealing for randomizing assignment and being asympto…
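To make the sequential-decision framing concrete, here is a minimal sketch of standard Thompson sampling for a Bernoulli (success/failure) bandit. This is illustrative only, not code from the article; the arm reward probabilities, horizon, and function names are assumptions for the demo.

```python
# Minimal Thompson sampling for a Bernoulli bandit (illustrative sketch,
# not code from the article). Each arm keeps a Beta(successes+1, failures+1)
# posterior; at each step we draw one sample per arm and play the argmax,
# so assignment to arms stays randomized.
import random

def thompson_sampling(reward_probs, n_steps=1000, seed=0):
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    total_reward = 0
    for _ in range(n_steps):
        # One posterior draw per arm; the draw itself is the randomization.
        draws = [rng.betavariate(successes[a] + 1, failures[a] + 1)
                 for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: draws[a])
        reward = 1 if rng.random() < reward_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    return total_reward

# Hypothetical two-arm experiment, e.g., two message variants.
print(thompson_sampling([0.10, 0.15]))
```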

Cited by 20 publications (10 citation statements) | References 53 publications
“…One perspective in the literature is that Thompson sampling, and its variants, describe a way of making decisions given uncertainty, and that the use of a posterior for uncertainty quantification can be swapped with some alternative. One option is to use bootstrap samples rather than posterior samples [Eckles and Kaptein, 2019; Russo et al., 2018]. This has been used most substantially in reinforcement learning [Riquelme et al., 2018; Osband et al., 2016].…”
Section: Policy Learning
confidence: 99%
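As a concrete reading of this swap, the sketch below replaces the per-arm posterior draw with the mean of a with-replacement resample of that arm's observed rewards. Everything here (arm histories, helper names) is hypothetical; storing full reward histories is also wasteful, which is what the online bootstrap in the next excerpt addresses.

```python
# Sketch of the posterior-to-bootstrap swap: the per-arm "sample" fed into
# the Thompson step is the mean of a with-replacement resample of that
# arm's observed rewards, rather than a draw from a Bayesian posterior.
# All names and data here are illustrative, not from the cited papers.
import random

def bootstrap_draw(rewards, rng):
    if not rewards:
        # No data yet: return a uniform draw so untried arms get explored.
        return rng.random()
    resample = rng.choices(rewards, k=len(rewards))
    return sum(resample) / len(resample)

rng = random.Random(1)
history = {0: [1, 0, 0, 1], 1: [0, 0, 1]}    # hypothetical rewards per arm
draws = {arm: bootstrap_draw(obs, rng) for arm, obs in history.items()}
chosen = max(draws, key=draws.get)           # greedy w.r.t. bootstrap draws
```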
“…Motivated by the above, and inspired by the existing relationship between bootstrap distributions (Rubin, 1981) and Bayesian posteriors (i.e., bootstrap distributions can be used to approximate posteriors; Efron, 2012; Newton and Raftery, 1994), Eckles and Kaptein (2019) formulated a Bootstrap Thompson Sampling (BTS) technique that replaces the posterior with an online bootstrap distribution of the point estimate μ_t at each time t. In empirical evaluations, the authors showed that, in comparison with LinTS and other methods, BTS is more robust to model misspecification, thanks to the robustness of the bootstrap approach, and it can be easily adapted to dependent observations, a common feature of behavioral science data (Eckles and Kaptein, 2014, 2019).…”
Section: Contextual Bandits With LinTS Exploration
confidence: 99%
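A minimal sketch of that idea follows, assuming the common "double-or-nothing" online bootstrap: each arm keeps J replicates of its mean-reward estimate, each observation updates a replicate with probability 1/2, and each decision acts greedily on one uniformly chosen replicate. The replicate count and the Bernoulli(1/2) weighting are implementation choices of this sketch, not prescriptions from the cited paper.

```python
# Sketch of Bootstrap Thompson Sampling with an online ("double-or-nothing")
# bootstrap: J replicates per arm stand in for posterior samples.
import random

class BTSArm:
    def __init__(self, n_replicates, rng):
        self.rng = rng
        # Each replicate tracks (reward_sum, count) for its own bootstrap
        # resample of the data stream.
        self.reps = [[0.0, 0] for _ in range(n_replicates)]

    def draw(self):
        # One uniformly chosen replicate plays the role of a posterior draw.
        s, n = self.rng.choice(self.reps)
        return s / n if n > 0 else float("inf")  # prefer untried arms

    def update(self, reward):
        # Double-or-nothing: each replicate sees the observation with
        # probability 1/2, so the replicates form an online bootstrap
        # distribution of the arm's point estimate.
        for rep in self.reps:
            if self.rng.random() < 0.5:
                rep[0] += reward
                rep[1] += 1

def bts_step(arms):
    draws = [arm.draw() for arm in arms]
    return max(range(len(arms)), key=lambda a: draws[a])

# Hypothetical run on a two-armed Bernoulli bandit; probabilities and
# horizon are made up for the demo.
rng = random.Random(42)
reward_probs = [0.10, 0.15]
arms = [BTSArm(n_replicates=100, rng=rng) for _ in reward_probs]
pulls = [0, 0]
for _ in range(5000):
    a = bts_step(arms)
    arms[a].update(1.0 if rng.random() < reward_probs[a] else 0.0)
    pulls[a] += 1
print(pulls)  # the better arm (index 1) should accumulate most pulls
```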
“…In this case a bootstrap distribution over means can be used to approximate a posterior distribution [11, 23]. Eckles et al. [9, 10] use a bootstrap distribution to replace the posterior distribution used in Thompson Sampling. This method is known as Bootstrap Thompson Sampling (BTS) [9] and was proposed in the multi-armed bandit setting.…”
Section: (Bootstrap) Thompson Sampling
confidence: 99%
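The bootstrap-approximates-posterior claim in [11, 23] can be sanity-checked numerically: for Bernoulli data, the bootstrap distribution of the sample mean and a flat-prior Beta posterior have similar location and spread. The data below are made up for the demonstration.

```python
# Numerical illustration (not from the cited papers): for 30 successes in
# 100 Bernoulli trials, compare the bootstrap distribution of the sample
# mean with the Beta(31, 71) posterior under a flat prior.
import random

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

rng = random.Random(7)
data = [1] * 30 + [0] * 70   # hypothetical 30 successes in 100 trials

boot = [mean(rng.choices(data, k=len(data))) for _ in range(10000)]
post = [rng.betavariate(30 + 1, 70 + 1) for _ in range(10000)]

print(f"bootstrap: mean={mean(boot):.3f} sd={sd(boot):.3f}")
print(f"posterior: mean={mean(post):.3f} sd={sd(post):.3f}")
```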