In the context of K-armed stochastic bandits with distributions only assumed to be supported in [0, 1], we introduce the first algorithm, called KL-UCB-switch, that simultaneously enjoys a distribution-free regret bound of optimal order √(KT) and a distribution-dependent regret bound of optimal order as well, that is, matching the κ ln T lower bound by Lai and Robbins [1985] and Burnetas and Katehakis [1996]. This self-contained contribution simultaneously presents state-of-the-art techniques for regret minimization in bandit models and an elementary construction of non-asymptotic confidence bounds based on the empirical likelihood method for bounded distributions.
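To make the switching rule concrete, here is a minimal Python sketch of a KL-UCB-switch-style policy. It is a simplification, not the paper's exact algorithm: the empirical-likelihood index is replaced by its Bernoulli-KL relaxation (valid for [0, 1]-bounded rewards, since the Bernoulli kl lower-bounds the general KL-inf), the pull-count threshold is taken as (T/K)^(1/5), and the arm samplers `arms` are hypothetical stand-ins for the environment.

```python
import math

def kl_bernoulli(p, q):
    """Bernoulli KL divergence kl(p, q), clipped away from 0 and 1."""
    p = min(max(p, 1e-12), 1 - 1e-12)
    q = min(max(q, 1e-12), 1 - 1e-12)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mu_hat, n, level):
    """Largest q >= mu_hat such that n * kl(mu_hat, q) <= level (bisection)."""
    lo, hi = mu_hat, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if n * kl_bernoulli(mu_hat, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo

def moss_index(mu_hat, n, T, K):
    """MOSS-style index for [0, 1]-bounded (hence 1/2-sub-Gaussian) rewards."""
    return mu_hat + math.sqrt(max(math.log(T / (K * n)), 0.0) / (2 * n))

def kl_ucb_switch(arms, T):
    """Run a KL-UCB-switch-style policy for T rounds on `arms`,
    a list of K samplers returning rewards in [0, 1]."""
    K = len(arms)
    switch_at = math.floor((T / K) ** 0.2)   # pull-count threshold (T/K)^(1/5)
    counts, sums = [0] * K, [0.0] * K
    for t in range(T):
        if t < K:                            # initialization: pull each arm once
            a = t
        else:
            def index(a):
                n = counts[a]
                mu_hat = sums[a] / n
                level = max(math.log(T / (K * n)), 0.0)
                if n <= switch_at:           # few pulls: KL-type index
                    return kl_ucb_index(mu_hat, n, level)
                return moss_index(mu_hat, n, T, K)   # many pulls: MOSS index
            a = max(range(K), key=index)
        counts[a] += 1
        sums[a] += arms[a]()
    return counts
```

The point of the switch is visible in the two branches: the KL-type index drives the ln T distribution-dependent guarantee for arms with few pulls, while the MOSS index controls the √(KT) worst case once an arm has been pulled often.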
Stochastic and adversarial data are two widely studied settings in online learning. But many optimization tasks are neither i.i.d. nor fully adversarial, which makes it of fundamental interest to get a better theoretical understanding of the world between these extremes. In this work we establish novel regret bounds for online convex optimization in a setting that interpolates between stochastic i.i.d. and fully adversarial losses. By exploiting smoothness of the expected losses, these bounds replace a dependence on the maximum gradient length by the variance of the gradients, which was previously known only for linear losses. In addition, they weaken the i.i.d. assumption by allowing adversarially poisoned rounds or shifts in the data distribution. To accomplish this goal, we introduce two key quantities associated with the loss sequence, that we call the cumulative stochastic variance and the adversarial variation. Our upper bounds are attained by instances of optimistic follow-the-regularized-leader (FTRL), and we design adaptive learning rates that automatically adapt to the cumulative stochastic variance and adversarial variation. In the fully i.i.d. case, our bounds match the rates one would expect from results in stochastic acceleration, and in the fully adversarial case they gracefully deteriorate to match the minimax regret. We further provide lower bounds showing that our regret upper bounds are tight for all intermediate regimes for the cumulative stochastic variance and the adversarial variation.
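As an illustration of the kind of algorithm behind such upper bounds, here is a minimal sketch of optimistic FTRL with a squared-norm regularizer over a Euclidean ball, where the learning rate adapts to the cumulative squared error of the hints. The names (`optimistic_ftrl`, the ball radius `D`) and the specific tuning are ours, a generic instance rather than the paper's exact algorithm.

```python
import numpy as np

def optimistic_ftrl(grads, hints, D=1.0):
    """Optimistic FTRL on the Euclidean ball of radius D.
    At round t it plays argmin_x <L_{t-1} + m_t, x> + ||x||^2 / (2 * eta_t),
    with eta_t adapted to the cumulative squared hint error sum ||g_s - m_s||^2."""
    d = len(grads[0])
    grad_sum = np.zeros(d)     # L_{t-1}: sum of past gradients
    err2 = 1e-8                # cumulative squared hint errors (small init)
    plays = []
    for g, m in zip(grads, hints):     # the hint m is available before g is revealed
        eta = D / np.sqrt(err2)
        x = -eta * (grad_sum + m)      # unconstrained FTRL minimizer
        norm = np.linalg.norm(x)
        if norm > D:                   # project back onto the ball
            x = (D / norm) * x
        plays.append(x)
        grad_sum += g
        err2 += np.linalg.norm(g - m) ** 2
    return plays
```

With the standard hint choice m_t = g_{t-1}, the error term sums squared successive gradient differences; under i.i.d. smooth losses its expectation is governed by the variance of the gradients, which is the mechanism by which bounds of this type replace the maximum gradient length.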
In this paper we consider a distributed online learning setting for joint regret with communication constraints. This is a multi-agent setting in which in each round t an adversary activates an agent, which has to issue a prediction. A subset of all the agents may then communicate a b-bit message to their neighbors in a graph. All agents cooperate to control the joint regret, which is the sum of the losses of the agents minus the losses evaluated at the best fixed common comparator parameters u. We provide a comparator-adaptive algorithm for this setting, which means that the joint regret scales with the norm ∥u∥ of the comparator. To address communication constraints we provide deterministic and stochastic gradient compression schemes and show that with these compression schemes our algorithm has worst-case optimal regret for the case that all agents communicate in every round. Additionally, we exploit the comparator-adaptive property of our algorithm to learn the best partition from a set of candidate partitions, which allows different subsets of agents to learn a different comparator.
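The compression step can be pictured with a standard unbiased stochastic quantizer, a sketch of the general technique rather than the paper's specific deterministic and stochastic schemes: each coordinate is randomly rounded to a small grid so that the message costs roughly log2(levels) bits per coordinate while remaining unbiased in expectation.

```python
import numpy as np

def stochastic_quantize(g, levels=2, rng=None):
    """Unbiased stochastic quantization of a vector g: each coordinate is
    randomly rounded to one of `levels` grid points in [-s, s], s = ||g||_inf,
    so that E[output] = g. Sending the result costs about
    d * log2(levels) bits plus the bits needed for the scale s."""
    rng = rng or np.random.default_rng()
    s = np.max(np.abs(g))
    if s == 0.0:
        return g.copy()
    grid = np.linspace(-s, s, levels)
    step = grid[1] - grid[0]
    pos = np.clip((g - grid[0]) / step, 0, levels - 1)  # fractional grid position
    low = np.floor(pos).astype(int)
    frac = pos - low                       # probability of rounding upward
    up = rng.random(g.shape) < frac
    return grid[np.minimum(low + up, levels - 1)]
```

Unbiasedness follows because a coordinate lying between two grid points is rounded up with probability exactly proportional to its distance from the lower point, so the rounding error averages out to zero.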
We consider stochastic bandit problems with K arms, each associated with a bounded distribution supported on the range [m, M]. We do not assume that the range [m, M] is known and show that there is a cost for learning this range. Indeed, a new trade-off between distribution-dependent and distribution-free regret bounds arises, which, for instance, prevents one from simultaneously achieving the typical ln T and √T bounds: a √T distribution-free regret bound may only be achieved if the distribution-dependent regret bounds are at least of order √T. We exhibit a strategy achieving the rates for regret indicated by the new trade-off.
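To illustrate where the cost of an unknown range comes from, consider a hypothetical UCB-style policy that rescales its confidence widths by the range observed so far (our illustration, not the paper's strategy): early on, the observed range underestimates M − m, so the policy may under-explore, and guarding against this is precisely what creates the trade-off stated above.

```python
import math

def ucb_unknown_range(arms, T):
    """UCB-style sketch for rewards in an unknown range [m, M]:
    confidence widths are scaled by the range observed so far.
    Illustrative only; under-estimating the range early can cause
    under-exploration, the phenomenon behind the trade-off."""
    K = len(arms)
    counts, sums = [0] * K, [0.0] * K
    lo, hi = math.inf, -math.inf          # running estimate of [m, M]
    for t in range(T):
        if t < K:                         # pull each arm once
            a = t
        else:
            width = max(hi - lo, 1e-12)   # stand-in for the unknown M - m
            a = max(range(K), key=lambda a: sums[a] / counts[a]
                    + width * math.sqrt(2 * math.log(t) / counts[a]))
        r = arms[a]()
        lo, hi = min(lo, r), max(hi, r)
        counts[a] += 1
        sums[a] += r
    return counts
```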
A sequence of works in unconstrained online convex optimisation has investigated the possibility of adapting simultaneously to the norm U of the comparator and the maximum norm G of the gradients. In full generality, matching upper and lower bounds are known, which show that this comes at the unavoidable cost of an additive GU³ term that is not needed when either G or U is known in advance. Surprisingly, recent results by Kempka et al. (2019) show that no such price for adaptivity is needed in the specific case of 1-Lipschitz losses like the hinge loss. We follow up on this observation by showing that there is in fact never a price to pay for adaptivity if we specialise to any of the other common supervised online learning losses: our results cover log loss, (linear and non-parametric) logistic regression, square loss prediction, and (linear and non-parametric) least-squares regression. We also fill in several gaps in the literature by providing matching lower bounds with an explicit dependence on U. In all cases we obtain scale-free algorithms, which are suitably invariant under rescaling of the data. Our general goal is to establish achievable rates without concern for computational efficiency, but for linear logistic regression we also provide an adaptive method that is as efficient as the recent non-adaptive algorithm by Agarwal et al. (2021).
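A concrete example of a parameter-free method in this line of work is coin betting via the Krichevsky-Trofimov estimator, sketched below; it is illustrative of how one adapts to the comparator magnitude U without tuning, not one of the algorithms analysed in the paper, and it assumes 1-Lipschitz losses so that |g_t| ≤ 1.

```python
def coin_betting_1d(grads, eps=1.0):
    """One-dimensional parameter-free online learner via Krichevsky-Trofimov
    coin betting: assumes |g_t| <= 1 (e.g., 1-Lipschitz losses) and adapts
    to the comparator norm U with no tuning beyond the initial wealth eps."""
    wealth, coin_sum, plays = eps, 0.0, []
    for t, g in enumerate(grads, start=1):
        beta = coin_sum / t      # KT betting fraction, always in (-1, 1)
        w = beta * wealth        # prediction = bet a fraction of current wealth
        plays.append(w)
        wealth -= g * w          # wealth update; stays positive since |g*beta| < 1
        coin_sum += -g           # coin outcome c_t = -g_t
    return plays
```

Because the bet is always a fraction of the accumulated wealth, the predictions grow automatically toward large comparators when the gradients are consistently signed, which is the source of the adaptivity to U.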