Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may improve future performance. The algorithm addresses a broad range of problems in a computationally efficient manner and is therefore enjoying wide use. This tutorial covers the algorithm and its application, illustrating concepts through a range of examples, including Bernoulli bandit problems, shortest path problems, product assortment, recommendation, active learning with neural networks, and reinforcement learning in Markov decision processes. Most of these problems involve complex information structures, where information revealed by taking an action informs beliefs about other actions. We will also discuss when and why Thompson sampling is or is not effective and relations to alternative algorithms.
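As an illustration of the Bernoulli bandit case mentioned above, the following is a minimal sketch, not the tutorial's own code. It assumes independent arms with uniform Beta(1, 1) priors; the function name `thompson_bernoulli` and its parameters are hypothetical, chosen only for this example. Each round it draws one sample from every arm's posterior, plays the arm with the largest sample, and updates that arm's posterior with the observed reward.

```python
import random


def thompson_bernoulli(true_probs, n_rounds, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit.

    true_probs: success probability of each arm (unknown to the agent).
    Assumption for this sketch: independent Beta(1, 1) priors per arm.
    Returns (pull counts per arm, total reward).
    """
    rng = random.Random(seed)
    k = len(true_probs)
    alpha = [1] * k  # 1 + observed successes per arm
    beta = [1] * k   # 1 + observed failures per arm
    pulls = [0] * k
    total_reward = 0

    for _ in range(n_rounds):
        # Sample a plausible success rate for each arm from its posterior.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(k), key=lambda i: samples[i])
        # Observe a Bernoulli reward from the simulated environment.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate Beta update for the arm that was played.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_bernoulli([0.1, 0.9], n_rounds=2000, seed=1)
    print("pulls per arm:", pulls, "total reward:", total)
```

Because arms with uncertain posteriors occasionally produce large samples, they still get tried, while arms that are confidently poor are played less and less. This is how the sketch balances exploration against exploitation. It does not cover the problems with complex information structures, where observing one action informs beliefs about others.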
Background: Fecal microbiota transplantation is an effective treatment for recurrent Clostridium difficile infection and is being investigated as a treatment for other microbiota-associated diseases. To facilitate these activities, an international public stool bank has been created, which screens donors and processes stools in a standardized manner. The goal of this research is to use mathematical modeling and analysis to optimize screening and donor management at the stool bank. Results: Compared to the current policy of screening active donors every 60 days before releasing their quarantined stools for sale, costs can be reduced by 10.3% by increasing the screening frequency to every 36 days. In addition, the stool production rate varies widely across donors, and using donor-specific screening, where higher producers are screened more frequently, also reduces costs, as does introducing an interim (i.e., between consecutive regular tests) stool test for just rotavirus and C. difficile. We also derive a donor release (i.e., into the system) policy that allows the supply to approximately match an exponentially increasing deterministic demand. Conclusions: More frequent screening, interim screening for rotavirus and C. difficile, and donor-specific screening, where higher stool producers are screened more frequently, are all cost-reducing measures. If screening costs decrease in the future (e.g., as a result of bringing screening in house), a bottleneck for implementing some of these recommendations may be the reluctance of donors to undergo serum screening more frequently than monthly. Electronic supplementary material: The online version of this article (doi:10.1186/s40168-015-0140-3) contains supplementary material, which is available to authorized users.