2020
DOI: 10.48550/arxiv.2004.00184
Preprint

A theory of independent mechanisms for extrapolation in generative models

Abstract: Deep generative models reproduce complex empirical data but cannot extrapolate to novel environments. An intuitive idea to promote extrapolation capabilities is to enforce the architecture to have the modular structure of a causal graphical model, where one can intervene on each module independently of the others in the graph. We develop a framework to formalize this intuition, using the principle of Independent Causal Mechanisms, and show how over-parameterization of generative neural networks can hinder extrapolation…
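As a rough illustration of the modular idea described in the abstract, the sketch below builds a generator from separate mechanism modules and swaps one of them out without touching the others. It is a minimal hypothetical example, not the paper's architecture: the class name, dimensions, and the PyTorch framing are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class ModularGenerator(nn.Module):
    """Hypothetical generator built from independent mechanism modules.

    Each mechanism maps its own noise variable to one factor of the output,
    mirroring the modules of a causal graphical model; a decoder combines
    the factors into the observation.
    """

    def __init__(self, noise_dim=8, factor_dim=16, n_mechanisms=3):
        super().__init__()
        self.mechanisms = nn.ModuleList([
            nn.Sequential(nn.Linear(noise_dim, 32), nn.ReLU(),
                          nn.Linear(32, factor_dim))
            for _ in range(n_mechanisms)
        ])
        self.decoder = nn.Linear(factor_dim * n_mechanisms, 64)

    def forward(self, noises, interventions=None):
        # `interventions` maps a mechanism index to a replacement module,
        # so one mechanism can be changed independently of the others.
        interventions = interventions or {}
        factors = []
        for i, (mech, z) in enumerate(zip(self.mechanisms, noises)):
            mech = interventions.get(i, mech)
            factors.append(mech(z))
        return self.decoder(torch.cat(factors, dim=-1))

gen = ModularGenerator()
zs = [torch.randn(4, 8) for _ in range(3)]
x = gen(zs)                                  # observational sample
new_mech = nn.Linear(8, 16)                  # intervene on mechanism 1 only
x_intervened = gen(zs, interventions={1: new_mech})
```

The abstract's point is that, in practice, over-parameterized generative networks can fail to respect this kind of modularity, which is what the paper's framework is meant to formalize.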


Cited by 3 publications (7 citation statements)
References 13 publications
“…Extensive results in ZSL and OSR show that our method indeed improves the balance and hence achieves the state-of-the-art performance. As a future direction, we will seek new definitions of disentanglement [58] and devise practical implementations to achieve improved disentanglement [9]. A.1.…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, recent GAN models [74,26] can generate a large variety of photorealistic images by only using Z. In fact, as shown in recent literature [9], it is possible for an over-parameterized model P_θ(X|Z, Y) to ignore Y and use purely Z to generate X, leading to non-faithful generations. This is because the information in Y might be fully contained in Z.…”
Section: Counterfactual-faithful Training (mentioning)
confidence: 99%
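The statement above refers to the paper's observation that an over-parameterized conditional model P_θ(X|Z, Y) can ignore Y whenever Z already carries the same information. The toy least-squares check below (hypothetical data and model, not taken from either paper) shows the mechanism: once y is embedded in z, a fit that drops y is exactly as good as one that uses it, so nothing forces the model to rely on y.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: the latent code z already contains the label y,
# z = [y, noise], and x is generated from z alone.
n = 1000
y = rng.integers(0, 2, size=n).astype(float)      # binary "class" label
noise = rng.normal(size=n)
z = np.stack([y, noise], axis=1)                  # y is fully contained in z
x = 2.0 * z[:, 0] + 0.5 * z[:, 1] + 0.01 * rng.normal(size=n)

# Model A: "conditional" least-squares fit using both z and y.
# Model B: the same fit using z only, i.e. the model that ignores y.
fit = lambda F: F @ np.linalg.lstsq(F, x, rcond=None)[0]
err_with_y = np.mean((x - fit(np.column_stack([z, y]))) ** 2)
err_without_y = np.mean((x - fit(z)) ** 2)

# Both fits are equally good: y adds no information beyond z, so an
# over-parameterized generator is free to ignore it entirely, which is
# the "non-faithful generation" failure mode the excerpt describes.
print(f"MSE with y: {err_with_y:.6f}  MSE without y: {err_without_y:.6f}")
```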
“…Based on group theory and symmetry transformations, Higgins et al. (2018) provides the "first principled definition of a disentangled representation". Closely related to this concept is also the field of causality in machine learning (Schölkopf, 2019; Suter et al., 2019), more specifically the search for causal generative models (Besserve et al., 2018, 2020). In terms of implementable metrics, a variety of quantities have been introduced, such as the β-VAE score (Higgins et al., 2017), the SAP score (Kumar et al., 2017), the DCI scores (Eastwood and Williams, 2018) and the Mutual Information Gap (MIG; Chen et al., 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…Note that the given "fact" is Z = z(x) … To the best of our knowledge, the proposed counterfactual framework is the first to provide a theoretical ground for balancing and improving the seen/unseen classification. In particular, we show that the quality of disentangling Z and Y is the key bottleneck, so it is a potential future direction for ZSL/OSR [38,167,231,311]. Our method can serve as an unseen/seen binary classifier, which can be used plug-and-play to boost existing ZSL/OSR methods to achieve new state-of-the-arts (Section 5.2.6).…”
Section: Introduction (mentioning)
confidence: 96%