We investigate the value of labels in a simple version of the standard on-line prediction model (the "experts" setting). We present algorithms and adversary arguments defining tradeoffs between the number of mistakes made and the number of labels that the learner requests. One version of this question can be viewed as a family of games whose value is given by a complicated recurrence. Although our attempts to find a closed form for this recurrence have been unsuccessful, we show how an algorithm can efficiently compute its value, enabling it to perform optimally.
We examine a general Bayesian framework for constructing on-line prediction algorithms in the experts setting. These algorithms predict the bits of an unknown Boolean sequence using the advice of a finite set of experts. In this framework we use probabilistic assumptions on the unknown sequence to motivate prediction strategies. However, the relative bounds that we prove on the number of prediction mistakes made by these strategies hold for any sequence. The Bayesian framework provides a unified derivation and analysis of previously known prediction strategies, such as the Weighted Majority and Binomial Weighting algorithms. Furthermore, it provides a principled way of automatically adapting the parameters of Weighted Majority to the sequence, in contrast to previous ad hoc doubling techniques. Finally, we discuss the generalization of our methods to algorithms making randomized predictions.
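To make the experts setting concrete, the following is a minimal sketch of the Weighted Majority algorithm of Littlestone and Warmuth [10], one of the strategies recovered by the framework discussed here. The penalty parameter beta, the function name, and the toy data are illustrative choices for this sketch, not details taken from the paper.

    # Minimal sketch of a Weighted Majority-style master algorithm.
    # The parameter beta and all names below are illustrative, not the paper's.
    def weighted_majority(expert_advice, outcomes, beta=0.5):
        """expert_advice: T rounds, each a list of N bits (one per expert).
        outcomes: the T true bits. Returns the number of master mistakes."""
        n = len(expert_advice[0])
        weights = [1.0] * n          # uniform initial weights over the N experts
        mistakes = 0
        for advice, outcome in zip(expert_advice, outcomes):
            # Predict with the weighted majority vote of the experts' advice.
            vote_one = sum(w for w, a in zip(weights, advice) if a == 1)
            vote_zero = sum(w for w, a in zip(weights, advice) if a == 0)
            prediction = 1 if vote_one >= vote_zero else 0
            if prediction != outcome:
                mistakes += 1
            # Multiplicatively penalize every expert whose advice was wrong.
            weights = [w * beta if a != outcome else w
                       for w, a in zip(weights, advice)]
        return mistakes

    # Toy example: three experts, four bits.
    advice = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0]]
    bits = [1, 0, 1, 0]
    print(weighted_majority(advice, bits))

In the Bayesian view described above, the multiplicative update plays the role of a posterior reweighting of the experts after each observed bit; the point of the paper's analysis is that the resulting mistake bounds hold for any sequence, not only under the probabilistic assumptions that motivate the update.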
Introduction. A fundamental problem in learning theory is to predict the bits of an unknown Boolean sequence. The problem is uninteresting when the algorithm is required to minimize its worst-case number of mistakes over all sequences, as no algorithm can do better than random guessing. A richer problem results if the algorithm is given a (finite) set of models and the sequence is reasonably close to one generated by one of the models. Now interesting "relative" mistake bounds that depend on the distance between the unknown Boolean sequence and the closest model can be proven. This is sometimes referred to as the "experts" setting, since the models can be viewed as "experts" providing "advice" to the algorithm. Variants and extensions of this experts setting have been extensively studied by Littlestone and Warmuth [10], Vovk [12], Cesa-Bianchi et al. [2], [3], Haussler et al. [6], and others in the area of computational learning theory. Here we use a Bayesian approach to derive prediction algorithms with good performance in the experts setting. A crucial aspect of this work is that although the algorithms are derived by making probabilistic assumptions about the generation of the sequence to be predicted, they are analyzed in the adversarial experts setting.

In this experts setting, a "master algorithm" attempts to predict, one by one, the bits of an unknown sequence. Before predicting each bit, the master is allowed to listen to the "advice" provided by a pool of N experts. After each bit is revealed, the master