Smoothed Online Learning is as Easy as Statistical Learning

Block, Adam; Dagan, Yuval; Golowich, Noah; Rakhlin, Alexander

doi:10.48550/arxiv.2202.04690

Cited by 4 publications

(20 citation statements)

References 24 publications


“…In this section, we provide basic definitions and setup the learning problem. We begin by defining a smooth distribution, as in Block et al [2022], Haghtalab et al [2021]: Definition 1. Let µ be a probability measure on a measurable space X.…”

Section: Preliminaries (mentioning)

confidence: 99%

“…To circumvent the pessimism of the sequential setting, recent works [Rakhlin et al, 2011, Haghtalab et al, 2020, Block et al, 2022, Haghtalab et al, 2022] have studied the smoothed sequential learning paradigm, where the adversary is constrained to choose x_t at random from any probability distribution p_t with density at most 1/σ with respect to a known measure µ. The most current of these results point to a striking statistical-computational gap: whereas there exist algorithms which attain regret that scales with log(T/σ), computationally efficient algorithms can only hope for poly(T/σ) regret in general, even against a realizable adversary [Haghtalab et al, 2022, Theorem 5.2].…”

Section: Introduction (mentioning)

confidence: 99%
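The smoothness condition quoted above (density at most 1/σ with respect to µ) can be made concrete on a finite space. The sketch below is only an illustration and not code from any of the cited papers: the helper name `smoothness` and the dictionary representation of distributions are assumptions made here. It computes the largest σ for which a discrete distribution p is σ-smooth with respect to a base measure µ.

```python
from fractions import Fraction


def smoothness(p, mu):
    """Return the largest sigma such that p(x) <= mu(x) / sigma for all x.

    p and mu are dicts mapping outcomes to probabilities (hypothetical
    representation chosen for this sketch). Assumes p is a valid
    distribution with at least one outcome of positive mass. Returns 0
    if p puts mass on an outcome that mu does not cover, since no
    sigma > 0 works in that case.
    """
    max_ratio = Fraction(0)
    for x, px in p.items():
        if px == 0:
            continue
        mux = mu.get(x, 0)
        if mux == 0:
            return Fraction(0)
        # Density of p with respect to mu at the point x.
        max_ratio = max(max_ratio, Fraction(px) / Fraction(mux))
    return 1 / max_ratio


# Base measure: uniform on four points.
mu = {x: Fraction(1, 4) for x in range(4)}

# Uniform on half of the points: density is 2, so sigma = 1/2.
half = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Point mass: density is 4, so sigma = 1/4 (the least smooth case here).
point = {0: Fraction(1)}

print(smoothness(mu, mu), smoothness(half, mu), smoothness(point, mu))
```

A value of σ = 1 recovers the i.i.d. case where the adversary must play µ itself, while σ approaching 0 allows distributions that concentrate on ever smaller sets, approaching the fully adversarial setting.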

“…Finally, we present a complementary approach based on the perceptron algorithm which is robust to adversarial corruptions of the labels y_t, and enjoys a polynomial regret in a "directional smoothness" parameter which interpolates between the log(1/σ)-guarantees attained above in the realizable setting, and the poly(1/σ) bounds from prior work. We emphasize that, though we adopt the smoothed online learning setting of Rakhlin et al [2011], Haghtalab et al [2021], Block et al [2022], we use entirely different techniques involving Ville's inequality [Ville, 1939], geometric measure theory, and convex geometry. Moreover, in none of these works was the question of adapting to realizability explored; thus, we provide the first regret bounds that are logarithmic in both the horizon and the smoothness parameter.…”

Section: Introduction (mentioning)

confidence: 99%

“…Recently, Block et al [2022], Haghtalab et al [2022] generalized Haghtalab et al [2021] to allow for continuous labels and, more importantly, provided oracle-efficient algorithms for achieving vanishing regret in the smoothed setting. These papers also showed that the dependence on σ in the regret bounds of their oracle-efficient algorithms, which was polynomial, could not in general be reduced to the logarithmic dependence achievable by the inefficient algorithms, thereby exposing a statistical-computational gap.…”

Section: Introduction (mentioning)

confidence: 99%

“…These papers also showed that the dependence on σ in the regret bounds of their oracle-efficient algorithms, which was polynomial, could not in general be reduced to the logarithmic dependence achievable by the inefficient algorithms, thereby exposing a statistical-computational gap. Unlike other recent works such as Block et al [2022], Haghtalab et al [2022], we do not use the coupling approach [Haghtalab et al, 2021] to prove our regret bounds.…”

Section: Introduction (mentioning)

confidence: 99%


Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions

Block, Simchowitz

2022

Preprint

Self Cite


Due to the drastic gap in complexity between sequential and batch statistical learning, recent work has studied a smoothed sequential learning setting, where Nature is constrained to select contexts with density bounded by 1/σ with respect to a known measure µ. Unfortunately, for some function classes, there is an exponential gap between the statistically optimal regret and that which can be achieved efficiently. In this paper, we give a computationally efficient algorithm that is the first to enjoy the statistically optimal log(T/σ) regret for realizable K-wise linear classification. We extend our results to settings where the true classifier is linear in an over-parameterized polynomial featurization of the contexts, as well as to a realizable piecewise-regression setting assuming access to an appropriate ERM oracle. Somewhat surprisingly, standard disagreement-based analyses are insufficient to achieve regret logarithmic in 1/σ. Instead, we develop a novel characterization of the geometry of the disagreement region induced by generalized linear classifiers. Along the way, we develop numerous technical tools of independent interest, including a general anti-concentration bound for the determinant of certain matrix averages.


Sequential vs. Fixed Design Regrets in Online Learning

Wu, Heidari, Grama, et al.

2022

2022 IEEE International Symposium on Information Theory (ISIT)


We study the problem of online learning and online regret minimization when samples are drawn from a general unknown non-stationary process. We introduce the concept of a dynamic changing process with cost K, where the conditional marginals of the process can vary arbitrarily, but the number of different conditional marginals is bounded by K over T rounds. For such processes we prove a tight (up to a √(log T) factor) bound O(√(KT · VC(H) log T)) for the expected worst case regret of any finite VC-dimensional class H under absolute loss (i.e., the expected misclassification loss). We then improve this bound for general mixable losses, by establishing a tight (up to a log^3 T factor) regret bound O(K · VC(H) log^3 T). We extend these results to general smooth adversary processes with unknown reference measure by showing a sub-linear regret bound for 1-dimensional threshold functions under a general bounded convex loss. Our results can be viewed as a first step towards regret analysis with non-stationary samples in the distribution blind (universal) regime. This also brings a new viewpoint that shifts the study of complexity of the hypothesis classes to the study of the complexity of processes generating data.


Follow-the-Perturbed-Leader for Adversarial Markov Decision Processes with Bandit Feedback

Yan, Luo, Chen

2022

Preprint


We consider regret minimization for Adversarial Markov Decision Processes (AMDPs), where the loss functions are changing over time and adversarially chosen, and the learner only observes the losses for the visited state-action pairs (i.e., bandit feedback). While there has been a surge of studies on this problem using Online-Mirror-Descent (OMD) methods, very little is known about the Follow-the-Perturbed-Leader (FTPL) methods, which are usually computationally more efficient and also easier to implement since they only require solving an offline planning problem. Motivated by this, we take a closer look at FTPL for learning AMDPs, starting from the standard episodic finite-horizon setting. We find some unique and intriguing difficulties in the analysis and propose a workaround to eventually show that FTPL is also able to achieve near-optimal regret bounds in this case. More importantly, we then find two significant applications: First, the analysis of FTPL turns out to be readily generalizable to delayed bandit feedback with order-optimal regret, while OMD methods exhibit extra difficulties (Jin et al., 2022). Second, using FTPL, we also develop the first no-regret algorithm for learning communicating AMDPs in the infinite-horizon setting with bandit feedback and stochastic transitions. Our algorithm is efficient assuming access to an offline planning oracle, while even for the easier full-information setting, the only existing algorithm (Chandrasekaran and Tewari, 2021) is computationally inefficient.
