“…where $w_k$ is a parameter vector of the output layer; for $i \in [1:k-1]$ and $j \in [1:l]$, $W_i$ and $V_j$ are parameter matrices; $r_i(\cdot)$ and $s_j(\cdot)$ are entry-wise activation functions of layers $i$ and $j$, i.e., for $a \in \mathbb{R}^t$, $r_i(a) = [r_i(a_1), \ldots, r_i(a_t)]$ and $s_j(a) = [s_j(a_1), \ldots, s_j(a_t)]$; and $\sigma(\cdot)$ is the sigmoid function given by $\sigma(p) = 1/(1 + e^{-p})$ (note that $\sigma$ does not appear in the discriminator in [26, Equation (7)], as the discriminator considered in the neural net distance is not a soft classifier mapping to $[0,1]$). We assume that each $r_i(\cdot)$ and $s_j(\cdot)$ is $R_i$- and $S_j$-Lipschitz, respectively, and that each is positive homogeneous, i.e., $r_i(\lambda p) = \lambda r_i(p)$ and $s_j(\lambda p) = \lambda s_j(p)$ for any $\lambda \geq 0$ and $p \in \mathbb{R}$ (the ReLU activation, for instance, satisfies both properties: it is $1$-Lipschitz, and $\max(0, \lambda p) = \lambda \max(0, p)$ for $\lambda \geq 0$). Finally, as modelled in [26], [28]–[30], we assume that the Frobenius norms of the parameter matrices are bounded, i.e.,…”