There are many applications where positive instances are rare but important to identify. For example, in NLP, positive sentences for a given relation are rare in a large corpus. Positive data are more informative for learning in these applications, but until some data are labeled, it is unknown where the rare positives lie. Since random sampling wastes much of the labeling effort, previous "active search" methods use a single bandit model to learn about the data distribution (exploration) while sampling from the regions that potentially contain more positives (exploitation). Many bandit models are possible, and a sub-optimal model reduces labeling efficiency, yet the optimal model is unknown before any data are labeled. We propose Meta-AS (Meta Active Search), which uses a meta-bandit to evaluate a set of base bandits and aims to label positive examples as efficiently as the base bandit that is optimal in hindsight. The meta-bandit estimates the mean and variance of each base bandit's performance and selects a base bandit to propose the next data point to label, balancing exploration and exploitation. The label feedback updates both the base bandits and the meta-bandit for the next round. Meta-AS can accommodate a diverse set of base bandits that encode different assumptions about the dataset, without over-committing to a single model before labeling starts. Experiments on five relation-extraction datasets demonstrate that Meta-AS labels positives more efficiently than the base bandits and other bandit-selection strategies.

CCS CONCEPTS
• Information systems → Crowdsourcing;
• Theory of computation → Sequential decision making; Online learning theory; Active learning.
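
To make the selection loop concrete, below is a minimal, hypothetical Python sketch of a meta-bandit over base bandits. The names (MetaAS, BaseBandit, propose, update) and the Gaussian Thompson-sampling score are illustrative assumptions for exposition only; the paper's exact estimator and selection rule may differ.

    # Hypothetical sketch of the meta-bandit loop described in the abstract.
    # The Gaussian Thompson-sampling score is an assumption, not the paper's
    # exact method.
    import random
    import statistics


    class BaseBandit:
        """Minimal interface a base bandit is assumed to expose."""

        def propose(self, pool):
            """Return the next unlabeled example to query."""
            raise NotImplementedError

        def update(self, x, y):
            """Incorporate label feedback for a queried example."""


    class RandomBandit(BaseBandit):
        """Stand-in base bandit that proposes uniformly at random."""

        def propose(self, pool):
            return random.choice(list(pool))

        def update(self, x, y):
            pass


    class MetaAS:
        """Meta-bandit that picks among base bandits by drawing from a
        Gaussian over each base bandit's observed reward (1 if the
        queried example was positive, else 0)."""

        def __init__(self, base_bandits):
            self.bandits = base_bandits
            self.rewards = [[] for _ in base_bandits]  # reward history per arm

        def _score(self, i):
            r = self.rewards[i]
            if len(r) < 2:              # force initial exploration of each arm
                return float("inf")
            mu = statistics.mean(r)
            se = statistics.stdev(r) / len(r) ** 0.5  # std. error of the mean
            return random.gauss(mu, se)               # Thompson-style draw

        def step(self, pool, oracle):
            """One labeling round: choose a base bandit, query its proposal,
            then propagate the label to the meta-bandit and all base bandits."""
            k = max(range(len(self.bandits)), key=self._score)
            x = self.bandits[k].propose(pool)
            y = oracle(x)                   # human labeling, 1 = positive
            pool.discard(x)
            self.rewards[k].append(y)       # update the chosen arm's stats
            for b in self.bandits:          # every base bandit sees the label
                b.update(x, y)
            return x, y


    if __name__ == "__main__":
        # Toy run: 5 rare positives hidden in a pool of 100 examples.
        pool = set(range(100))
        positives = set(random.sample(sorted(pool), 5))
        meta = MetaAS([RandomBandit(), RandomBandit()])
        for _ in range(20):
            meta.step(pool, oracle=lambda x: int(x in positives))

In the paper's setting the base bandits would encode different structural assumptions about where positives cluster; the random proposer here is only a placeholder so the sketch runs end to end.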