“…In the learning-rate regime, we used batch size 64 while sweeping the learning rates [0.003, 0.002, 0.001, 0.0003, 0.0001, 0.00003, 0.00001]. In the batch-size regime, we used learning rate 0.0001 and batch sizes [8, 16, 32, 64, 128, 256, 512]. Cross-entropy loss was used with the Adam optimizer (β₁ = 0.9, β₂ = 0.999, ε = 1e−08).…”
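The two sweeps above can be sketched as a pair of configuration grids. This is a minimal, framework-agnostic illustration, not the authors' code; the dictionary keys and the `ADAM_KWARGS` name are assumptions for exposition, and the stated Adam hyperparameters happen to coincide with common library defaults (e.g. `torch.optim.Adam`).

```python
# Adam hyperparameters as stated in the text (these match PyTorch's defaults,
# e.g. torch.optim.Adam(params, lr=cfg["lr"], betas=(0.9, 0.999), eps=1e-8)).
ADAM_KWARGS = {"betas": (0.9, 0.999), "eps": 1e-8}

# Learning-rate regime: batch size fixed at 64, learning rate swept.
LR_SWEEP = [3e-3, 2e-3, 1e-3, 3e-4, 1e-4, 3e-5, 1e-5]
lr_regime = [{"lr": lr, "batch_size": 64, **ADAM_KWARGS} for lr in LR_SWEEP]

# Batch-size regime: learning rate fixed at 1e-4, batch size swept.
BS_SWEEP = [8, 16, 32, 64, 128, 256, 512]
bs_regime = [{"lr": 1e-4, "batch_size": bs, **ADAM_KWARGS} for bs in BS_SWEEP]

# Each regime varies exactly one hyperparameter while holding the other fixed,
# so the two grids isolate the effect of learning rate and batch size.
print(len(lr_regime), len(bs_regime))
```

Each entry in `lr_regime` or `bs_regime` would be passed to a single training run with cross-entropy loss; the two grids keep all other settings identical so the two factors can be studied independently.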