The increasingly tight coupling between humans and system operations in domains ranging from intelligent infrastructure to e-commerce has led to a challenging new class of problems founded on a well-established area of research: incentive design. There is a clear need for a new tool kit for designing mechanisms that help coordinate self-interested parties while avoiding unexpected outcomes in the face of information asymmetries, exogenous uncertainties from dynamic environments, and resource constraints. This article provides a perspective on the current state of the art in incentive design from three core communities—economics, control theory, and machine learning—and highlights interesting avenues for future research at the interface of these domains.
In this paper we introduce the transductive linear bandit problem: given a set of measurement vectors $\mathcal{X} \subset \mathbb{R}^d$, a set of items $\mathcal{Z} \subset \mathbb{R}^d$, a fixed confidence $\delta$, and an unknown vector $\theta^* \in \mathbb{R}^d$, the goal is to infer $\arg\max_{z \in \mathcal{Z}} z^\top \theta^*$ with probability $1 - \delta$ by making as few sequentially chosen noisy measurements of the form $x^\top \theta^*$ as possible. When $\mathcal{X} = \mathcal{Z}$, this setting generalizes linear bandits, and when $\mathcal{X}$ is the standard basis vectors and $\mathcal{Z} \subset \{0, 1\}^d$, combinatorial bandits. Such a transductive setting naturally arises when the set of measurement vectors is limited due to factors such as availability or cost. As an example, in drug discovery the compounds and dosages $\mathcal{X}$ a practitioner may be willing to evaluate in the lab in vitro due to cost or safety reasons may differ vastly from those compounds and dosages $\mathcal{Z}$ that can be safely administered to patients in vivo. Alternatively, in recommender systems for books, the set of books $\mathcal{X}$ a user is queried about may be restricted to well-known best-sellers even though the goal might be to recommend more esoteric titles $\mathcal{Z}$. In this paper, we provide instance-dependent lower bounds for the transductive setting, an algorithm that matches these up to logarithmic factors, and an evaluation. In particular, we provide the first non-asymptotic algorithm for linear bandits that nearly achieves the information theoretic lower bound.
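To make the problem setup concrete, the following is a minimal sketch of the measurement model only, not the algorithm of the paper. It assumes Gaussian noise, a fixed measurement budget, and a naive round-robin allocation over the measurement vectors followed by a least-squares estimate; the paper's setting is instead adaptive and fixed-confidence. The function name `round_robin_recommend`, the noise level, and the toy instance are all hypothetical choices made for illustration.

```python
import numpy as np


def round_robin_recommend(X, Z, theta_star, n_pulls, noise_std=1.0, seed=0):
    """Naive baseline (an assumption, not the paper's method):
    measure the rows of X in round-robin order, fit least squares,
    and return the index of the item in Z with the largest estimated value."""
    rng = np.random.default_rng(seed)
    # Choose which measurement vector to query at each step (non-adaptive).
    idx = np.arange(n_pulls) % X.shape[0]
    A = X[idx]
    # Noisy observations of the form x^T theta* + Gaussian noise.
    y = A @ theta_star + noise_std * rng.standard_normal(n_pulls)
    # Least-squares estimate of theta* from the collected measurements.
    theta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    # Recommend the item in Z (not in X) with the highest estimated reward.
    return int(np.argmax(Z @ theta_hat)), theta_hat


# Toy combinatorial instance: X is the standard basis, Z is a subset of {0,1}^d.
d = 3
X = np.eye(d)
Z = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]], dtype=float)
theta_star = np.array([1.0, 0.2, 0.9])

best = int(np.argmax(Z @ theta_star))
guess, theta_hat = round_robin_recommend(X, Z, theta_star, n_pulls=3000, noise_std=0.5)
print("true best item:", best, "recommended item:", guess)
```

The sketch separates what can be measured (rows of `X`) from what is recommended (rows of `Z`), which is the defining feature of the transductive setting. An adaptive method would replace the round-robin rule with an allocation that concentrates measurements on the directions that best distinguish competing items.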