We study the subject of universal online learning with non-i.i.d. processes for bounded losses. The notion of an universally consistent learning was defined by Hanneke [14] in an effort to study learning theory under minimal assumptions, where the objective is to obtain low long-run average loss for any target function. We are interested in characterizing processes for which learning is possible and whether there exist learning rules guaranteed to be universally consistent given the only assumption that such learning is possible. The case of unbounded losses is very restrictive, since the learnable processes almost surely visit a finite number of points and as a result, simple memorization is optimistically universal [14; 3]. We focus on the bounded setting and give a complete characterization of the processes admitting strong and weak universal learning. We further show that k-nearest neighbor algorithm (kNN) is not optimistically universal and present a novel variant of 1NN which is optimistically universal for general input and value spaces in both strong and weak setting. This closes all the COLT 2021 open problems posed in [15] on universal online learning.
We provide algorithms for regression with adversarial responses under large classes of non-i.i.d. instance sequences, on general separable metric spaces, with provably minimal assumptions. We also give characterizations of learnability in this regression context. We consider universal consistency which asks for strong consistency of a learner without restrictions on the value responses. Our analysis shows that such objective is achievable for a significantly larger class of instance sequences than stationary processes, and unveils a fundamental dichotomy between value spaces: whether finite-horizon mean-estimation is achievable or not. We further provide optimistically universal learning rules, i.e., such that if they fail to achieve universal consistency, any other algorithm will fail as well. For unbounded losses, we propose a mild integrability condition under which there exist algorithms for adversarial regression under large classes of non-i.i.d. instance sequences. In addition, our analysis also provides a learning rule for meanestimation in general metric spaces that is consistent under adversarial responses without any moment conditions on the sequence, a result of independent interest.
We study universal consistency of non-i.i.d. processes in the context of online learning. A stochastic process is said to admit universal consistency if there exists a learner that achieves vanishing average loss for any measurable response function on this process. When the loss function is unbounded, [1] showed that the only processes admitting strong universal consistency are those taking a finite number of values almost surely. However, when the loss function is bounded, the class of processes admitting strong universal consistency is much richer and its characterization could be dependent on the response setting [2]. In this paper, we show that this class of processes is independent from the response setting thereby closing an open question of [3] (Open Problem 3). Specifically, we show that the class of processes that admit universal online learning is the same for binary classification as for multiclass classification with countable number of classes. Consequently, any output setting with bounded loss can be reduced to binary classification. Our reduction is constructive and practical. Indeed, we show that the nearest neighbor algorithm is transported by our construction. For binary classification on a process admitting strong universal learning, we prove that nearest neighbor successfully learns at least all finite unions of intervals.
Motivated by the problem of bounding the number of iterations of the Simplex algorithm we investigate the possible lengths of monotone paths followed inside the oriented graphs of polyhedra (oriented by the objective function). We consider both the shortest and the longest monotone paths and estimate the monotone diameter and height of polyhedra. Our analysis applies to transportation polytopes, matroid polytopes, matching polytopes, shortest-path polytopes, and the TSP, among others.We begin by showing that combinatorial cubes have monotone diameter and Bland simplex height upper bounded by their dimension and that in fact all monotone paths of zonotopes are no larger than the number of edge directions of the zonotope. We later use this to show that several polytopes have polynomial-size monotone diameter. In contrast, we show that for many well-known combinatorial polytopes, the height is at least exponential. Surprisingly, for some famous pivot rules, e.g., greatest improvement and steepest edge, these same polytopes have polynomial-size simplex paths.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.