2022
DOI: 10.48550/arxiv.2203.16437
Preprint

Weakly supervised causal representation learning

Abstract: Learning high-level causal representations together with a causal model from unstructured low-level data such as pixels is impossible from observational data alone. We prove under mild assumptions that this representation is identifiable in a weakly supervised setting. This requires a dataset with paired samples before and after random, unknown interventions, but no further labels. Finally, we show that we can infer the representation and causal graph reliably in a simple synthetic domain using a variational a…
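The weakly supervised setting described in the abstract — paired samples generated from the same exogenous noise, before and after a random intervention whose target is not labeled — can be illustrated with a minimal sketch. The two-variable linear SCM, its coefficients, and the function names below are hypothetical illustrations, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_scm(eps, do_target=None, do_value=None):
    """Toy 2-variable linear SCM with graph z1 -> z2 (hypothetical example).
    An intervention do(z_k = v) overrides the structural assignment of z_k."""
    z1 = eps[0] if do_target != 0 else do_value
    z2 = (2.0 * z1 + eps[1]) if do_target != 1 else do_value
    return np.array([z1, z2])

def paired_sample():
    """One weakly supervised pair: the same exogenous noise eps is reused,
    and a random single-node intervention with an UNOBSERVED target is
    applied in the second sample. A decoder would map z to pixels x."""
    eps = rng.normal(size=2)
    z_before = sample_scm(eps)
    target = rng.integers(0, 2)  # intervention target is not part of the data
    z_after = sample_scm(eps, do_target=target, do_value=rng.normal())
    return z_before, z_after

pairs = [paired_sample() for _ in range(5)]
```

Only the pairs themselves would be available to a learner; neither the intervention targets nor the latents are labeled.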

Cited by 4 publications (7 citation statements)
References 11 publications
“…ILCM learns by maximising an ELBO approximation to the maximum likelihood, and a noise decoder p(x | ε) generates a sample from the inferred exogenous variables. Brehmer et al [100] reproduce interventional distributions X^I ∼ p(x | do(a)) as demonstrated in Fig. 4.5.…”
Section: Weakly Supervised Causal Disentanglement
confidence: 66%
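The mechanism this statement describes — inferring exogenous noise and then regenerating observations under an intervention do(a) — can be sketched on a toy model. Everything below (the SCM, the linear "decoder" weights, the function names) is an illustrative assumption; a trained ILCM would infer the noise from pixels rather than receive it directly:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear "decoder" mapping latents z to observations x;
# in the actual method this would be a learned neural decoder.
W_dec = np.array([[1.0, 0.3], [0.2, 1.0]])

def forward(eps, do=None):
    """Run a toy SCM z1 -> z2 on exogenous noise eps, then decode.
    `do` is either None (observational) or a pair (target, value)."""
    z1 = eps[0] if do is None or do[0] != 0 else do[1]
    z2 = 2.0 * z1 + eps[1] if do is None or do[0] != 1 else do[1]
    return W_dec @ np.array([z1, z2])

eps_hat = rng.normal(size=2)           # stands in for the inferred noise
x_obs = forward(eps_hat)               # observational sample
x_int = forward(eps_hat, do=(0, 1.5))  # sample from p(x | do(z1 = 1.5))
```

Because the same inferred noise is reused, the intervened sample differs from the observational one only through the downstream effects of the intervention.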
“…Brehmer et al [100] improve upon Ada-GVAE by removing the requirement that a practitioner has access to the number of intervened variables between sample pairs, and thus propose to learn a generative model over a collection of tuples {(x_i, x_i^I)}_{i=1}^{n}. In this setting, they propose two types of models: Explicit or Implicit Latent Causal Models.…”
Section: Weakly Supervised Causal Disentanglement
confidence: 99%
“…Lippe et al. exploit causal interventions on the latents to provide identification guarantees, but require knowledge of the intervention targets and assume an invariant causal model describing the relations between any adjacent time frames. In concurrent work, Brehmer et al. (2022) leverage data generated under causal interventions as a source of weak supervision and prove identification for structural causal models that are diffeomorphic transforms of exogenous noise. In addition to the above, a number of recent papers explain the success of self-supervised contrastive learning through the lens of identification of representations.…”
Section: Related Work
confidence: 99%
“…More recently, Khemakhem et al. (2020a) achieved a major breakthrough by showing that, given side information u, identifiability of the entire generative model is possible up to certain (nonlinear) equivalences. Since this pathbreaking work, many generalizations have been proposed (Hälvä and Hyvarinen, 2020; Hälvä et al., 2021; Khemakhem et al., 2020b; Li et al., 2019; Mita et al., 2021; Sorrenson et al., 2019; Yang et al., 2021; Klindt et al., 2020; Brehmer et al., 2022), all of which require some form of auxiliary information. Other approaches to identifiability include various forms of weak supervision such as contrastive learning (Zimmermann et al., 2021), group-based disentanglement (Locatello et al., 2020), and independent mechanisms (Gresele et al., 2021).…”
Section: Related Work
confidence: 99%
“…This contrasts with a recent line of work that has established fundamental new results regarding the identifiability of VAEs, which requires conditioning on an auxiliary variable u that renders each latent dimension conditionally independent (Khemakhem et al., 2020a). While this result has been generalized and relaxed in several directions (Hälvä and Hyvarinen, 2020; Hälvä et al., 2021; Khemakhem et al., 2020b; Li et al., 2019; Mita et al., 2021; Sorrenson et al., 2019; Yang et al., 2021; Klindt et al., 2020; Brehmer et al., 2022), fundamentally these results still crucially rely on the side information u. We show that this is in fact unnecessary, confirming existing empirical studies (e.g. Willetts and Paige, 2021; Falck et al., 2021), and do so without sacrificing any representational capacity.…”
Section: Introduction
confidence: 98%