2020
DOI: 10.48550/arxiv.2002.10286
Preprint
Prediction with Corrupted Expert Advice

Abstract: We revisit the fundamental problem of prediction with expert advice, in a setting where the environment is benign and generates losses stochastically, but the feedback observed by the learner is subject to a moderate adversarial corruption. We prove that a variant of the classical Multiplicative Weights algorithm with decreasing step sizes achieves constant regret in this setting and performs optimally in a wide range of environments, regardless of the magnitude of the injected corruption. Our results reveal a…
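A minimal sketch of the kind of algorithm the abstract refers to: Multiplicative Weights run with a decreasing step size on (possibly corrupted) loss feedback. The step-size schedule eta_t = eta0 / sqrt(t), the loss model, and the function names below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def decreasing_step_mw(losses, eta0=1.0):
    """Multiplicative Weights with a decreasing step size (illustrative sketch).

    losses: (T, K) array of per-round expert losses in [0, 1]; this is the
            feedback the learner observes, which may include corruption.
    eta0:   base learning rate; eta_t = eta0 / sqrt(t) is an assumed schedule,
            not necessarily the one analyzed in the paper.
    Returns the regret against the best fixed expert on the observed losses.
    """
    T, K = losses.shape
    log_weights = np.zeros(K)              # keep weights in log-space for numerical stability
    learner_loss = 0.0
    for t in range(1, T + 1):
        eta_t = eta0 / np.sqrt(t)          # decreasing step size
        p = np.exp(log_weights - log_weights.max())
        p /= p.sum()                       # current distribution over experts
        learner_loss += float(p @ losses[t - 1])
        log_weights -= eta_t * losses[t - 1]   # multiplicative update on observed losses
    best_expert_loss = losses.sum(axis=0).min()
    return learner_loss - best_expert_loss

# Toy run: 5 experts with stochastic Bernoulli losses; the best expert's
# feedback is adversarially corrupted for the first 50 rounds.
rng = np.random.default_rng(0)
L = rng.binomial(1, [0.2, 0.5, 0.5, 0.5, 0.5], size=(1000, 5)).astype(float)
L[:50, 0] = 1.0
print(decreasing_step_mw(L))
```

Working in log-space avoids underflow once cumulative losses grow large; the sketch only illustrates the decreasing-step-size idea and makes no claim about matching the paper's exact algorithm or guarantees.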

Cited by 4 publications (8 citation statements)
References 10 publications
“…This gap was partially bridged by Zimmert and Seldin (2019), who proved that online mirror descent with the Tsallis-INF regularizer can achieve the optimal O(K ln T + C) bound in expectation, provided that the optimal arm is unique. Recently, the adversarially corrupted stochastic reward model has been extended to prediction with expert advice (Amir et al. 2020), assortment optimization (Chen, Krishnamurthy, and Wang 2019), Gaussian bandits (Bogunovic, Krause, and Scarlett 2020), linear bandits (Kapoor, Patel, and Kar 2019; Li, Lou, and Shan 2019), and reinforcement learning (Lykouris et al. 2019). Instead of studying the budget-bounded corruption setting, several papers focus on the scenario where the rewards are corrupted with a fixed probability (Altschuler, Brunel, and Malek 2019; Kapoor, Patel, and Kar 2019; Guan et al. 2020).…”
Section: Stochastic Learning With Adversarial Corruptions (confidence: 99%)
“…Thus, a natural question arises: Is there a bandit model that lies between the stochastic and adversarial worlds and admits regret guarantees only slightly worse than the logarithmic regret bound? In fact, this question has been answered affirmatively in the context of MAB (Seldin and Slivkins 2014; Lykouris, Mirrokni, and Paes Leme 2018; Zimmert and Seldin 2019; Gupta, Koren, and Talwar 2019) and PEA (Amir et al. 2020), but it still remains open for GB.…”
(confidence: 98%)
“…The fact that the adversary can only interfere with the outcome of the prediction of one expert is the main difference between our framework and the classical prediction with expert advice problems in [2,10,11,15,18], as well as the bandit problems with corruption such as [19] and [25], where the regret bounds provided depend on the corruption. However, unlike [3] and [21], the adversary can optimally control the level of corruption at each round, and therefore the level of corruption may be unbounded.…”
Section: Problem Formulation (confidence: 99%)
“…Similar to [3,8,19,21,25], in this paper, we bridge the two cases by considering an adversary who cannot freely choose the outcomes. In our framework, the gains of the experts are drawn from a fixed distribution.…”
Section: Introduction (confidence: 99%)