We examine a general Bayesian framework for constructing on-line prediction algorithms in the experts setting. These algorithms predict the bits of an unknown Boolean sequence using the advice of a finite set of experts. In this framework we use probabilistic assumptions on the unknown sequence to motivate prediction strategies. However, the relative bounds that we prove on the number of prediction mistakes made by these strategies hold for any sequence. The Bayesian framework provides a unified derivation and analysis of previously known prediction strategies, such as the Weighted Majority and Binomial Weighting algorithms. Furthermore, it provides a principled way of automatically adapting the parameters of Weighted Majority to the sequence, in contrast to previous ad hoc doubling techniques. Finally, we discuss the generalization of our methods to algorithms making randomized predictions.
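The Weighted Majority algorithm mentioned above admits a short illustrative sketch. The code below is a minimal, assumed implementation of the standard deterministic Weighted Majority rule (it is not taken from this paper): each expert starts with weight 1, the master predicts by a weighted vote, and every expert that errs has its weight multiplied by a penalty `beta` in [0, 1).

```python
def weighted_majority(expert_advice, outcomes, beta=0.5):
    """Predict each bit by a weighted vote over expert predictions.

    expert_advice: T rows, each a list of N bits (one per expert).
    outcomes: the T true bits of the sequence.
    beta: multiplicative penalty in [0, 1) applied to wrong experts.
    Returns the number of mistakes made by the master.
    """
    n = len(expert_advice[0])
    weights = [1.0] * n
    mistakes = 0
    for advice, outcome in zip(expert_advice, outcomes):
        # Weighted vote: total weight behind prediction 1 vs. 0.
        vote_one = sum(w for w, a in zip(weights, advice) if a == 1)
        vote_zero = sum(weights) - vote_one
        prediction = 1 if vote_one >= vote_zero else 0
        if prediction != outcome:
            mistakes += 1
        # Penalize every expert that was wrong on this bit.
        weights = [w * beta if a != outcome else w
                   for w, a in zip(weights, advice)]
    return mistakes
```

With this update rule the master's mistake count can be bounded relative to the best expert's mistake count plus a term logarithmic in N, for any sequence; the choice of `beta` trades off these two terms, which is the parameter-tuning issue the Bayesian framework addresses.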
Introduction.

A fundamental problem in learning theory is to predict the bits of an unknown Boolean sequence. The problem is uninteresting when the algorithm is required to minimize its worst-case number of mistakes over all sequences, since no algorithm can then do better than random guessing. A richer problem results if the algorithm is given a (finite) set of models and the sequence is reasonably close to one generated by some model in the set. Interesting "relative" mistake bounds, depending on the distance between the unknown Boolean sequence and the closest model, can then be proven. This is sometimes referred to as the "experts" setting, since the models can be viewed as "experts" providing "advice" to the algorithm. Variants and extensions of this experts setting have been extensively studied by Littlestone and Warmuth [10], Vovk [12], Cesa-Bianchi et al. [2], [3], Haussler et al. [6], and others in the area of computational learning theory. Here we use a Bayesian approach to derive prediction algorithms with good performance in the experts setting. A crucial aspect of this work is that, although the algorithms are derived by making probabilistic assumptions about the generation of the sequence to be predicted, they are analyzed in the adversarial experts setting.

In this experts setting, a "master algorithm" attempts to predict, one by one, the bits of an unknown sequence. Before predicting each bit, the master is allowed to listen to the "advice" provided by a pool of N experts. After each bit is revealed, the master