2021
DOI: 10.48550/arxiv.2101.09315
Preprint

Tighter expected generalization error bounds via Wasserstein distance

Abstract: In this work, we introduce several expected generalization error bounds based on the Wasserstein distance. More precisely, we present full-dataset, single-letter, and random-subset bounds on both the standard setting and the randomized-subsample setting from Steinke and Zakynthinou [2020]. Moreover, we show that, when the loss function is bounded, these bounds recover from below (and thus are tighter than) current bounds based on the relative entropy and, for the standard setting, generate new, nonvacuous boun…
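The truncated claim that these bounds "recover from below" the relative-entropy bounds can be sketched, for orientation, in a single chain of inequalities. This is a hedged reconstruction rather than a quotation from the paper: it assumes a loss taking values in [0,1], the discrete metric on the hypothesis space (under which the 1-Wasserstein distance coincides with total variation), and the usual notation W for the learned hypothesis, S = (Z_1, ..., Z_n) for the training sample, and P_{W|Z_i} for the conditional law of W given a single example:

\begin{align*}
\bigl|\mathbb{E}[\operatorname{gen}(W,S)]\bigr|
  &\le \frac{1}{n}\sum_{i=1}^{n} \mathbb{E}_{Z_i}\!\bigl[\mathbb{W}\bigl(P_{W\mid Z_i},\,P_W\bigr)\bigr]
  && \text{(single-letter Wasserstein bound)} \\
  &= \frac{1}{n}\sum_{i=1}^{n} \mathbb{E}_{Z_i}\!\bigl[\mathrm{TV}\bigl(P_{W\mid Z_i},\,P_W\bigr)\bigr]
  && \text{(discrete metric: } \mathbb{W}=\mathrm{TV}\text{)} \\
  &\le \frac{1}{n}\sum_{i=1}^{n} \mathbb{E}_{Z_i}\!\Bigl[\sqrt{\tfrac{1}{2}\,D\bigl(P_{W\mid Z_i}\,\big\|\,P_W\bigr)}\Bigr]
  && \text{(Pinsker's inequality)} \\
  &\le \frac{1}{n}\sum_{i=1}^{n} \sqrt{\tfrac{1}{2}\,I(W;Z_i)}
  && \text{(Jensen's inequality),}
\end{align*}

so that, under these assumptions, the Wasserstein bound sits below the individual-sample mutual-information bound that applies to losses bounded in [0,1].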

Cited by 4 publications (5 citation statements) · References 11 publications
“…This triple generalization, and the optimization point of view on Bayesian statistics is strongly advocated in [100] (in particular reasons to replace KL by D are given in this paper that seem to me more relevant in practice than Theorem 5.3). In this spirit, [153,47] provided PAC-Bayes type bounds where D is the Wasserstein distance. (6.4)…”
Section: Variational Approximations
Mentioning, confidence: 99%
“…Two examples are used to illustrate that the proposed approach can overcome some difficulties in applying the chaining mutual information approach. The roles that chaining can play in bounding generalization error in conjunction with other information-theoretic approaches, such as the conditional mutual information [8], information density [14], and Wasserstein distance [15], as well as the possible application in noisy and stochastic learning algorithms, call for further research.…”
Section: Discussion
Mentioning, confidence: 99%
“…Hellström and Durisi [13,14] used information density to unify several existing bounds and bounding approaches. Similar bounds using other measures can be found in [15]. The information-theoretic bounds have been used to bound generalization errors in noisy and iterative learning algorithms [16,17,18,9,11].…”
Section: Related Work
Mentioning, confidence: 94%
“…(Bu et al, 2020a) provides tighter bounds by considering the individual sample mutual information, (Asadi et al, 2018; Asadi & Abbe, 2020) propose using chaining mutual information, and (Steinke & Zakynthinou, 2020; Hafez-Kolahi et al, 2020; Haghifam et al, 2020) advocate the conditioning and processing techniques. Information-theoretic generalization error bounds using other information quantities are also studied, such as f-divergence (Jiao et al, 2017), α-Rényi divergence and maximal leakage (Issa et al, 2019; Esposito et al, 2019), Jensen-Shannon divergence (Aminian et al, 2020), and Wasserstein distance (Lopez & Jog, 2018; Wang et al, 2019; Rodríguez-Gálvez et al, 2021).…”
Section: Other Related Work
Mentioning, confidence: 99%