2021
DOI: 10.48550/arxiv.2110.04442
Preprint

Deep Learning of Potential Outcomes

Abstract: This review systematizes the emerging literature for causal inference using deep neural networks under the potential outcomes framework. It provides an intuitive introduction on how deep learning can be used to estimate/predict heterogeneous treatment effects and extend causal inference to settings where confounding is non-linear, time varying, or encoded in text, networks, and images. To maximize accessibility, we also introduce prerequisite concepts from causal inference and deep learning. The survey differs…

Cited by 6 publications (10 citation statements)
References 22 publications

“…This review work complements two excellent and extensive overviews of causal inference methods by Yao et al [4] and Koch et al [21]. The former focused on categorizing existing causal inference methods.…”
Section: Related Work
confidence: 85%
“…The simulation we have set up for assessing predictive performance shows promise but also reveals limitations that are discussed further below. However, from a stringent machine learning perspective, one might also consider adopting deep neural networks as causal estimators, a novel and vibrant research area [97, 98]. Preliminary studies in this field claim low bias and suitability for estimating heterogeneous treatment effects, as well as opportunities to predict causal effects in untreated populations beyond the original sample.…”
Section: Discussion
confidence: 99%
“…Preliminary studies in this field claim low bias and suitability for estimating heterogeneous treatment effects, as well as opportunities to predict causal effects in untreated populations beyond the original sample. However, there is still little consensus on important practical considerations needed to deploy these tools in the wild [97]. Regarding causal inference for affective/social behaviour, the most promising long-term avenue is deep learning of the causal structure of dynamic systems and time-series data [99].…”
Section: Discussion
confidence: 99%
“…Causal prediction through DL algorithms has recently been drawing much attention. Koch et al. (2021) reviewed four different approaches to deep causal estimation: (i) meta-learners (Künzel et al., 2019) such as S-learners, T-learners, and X-learners; (ii) balancing through representation learning, such as TARNet (Shalit et al., 2017) and CFRNet (Johansson et al., 2018, 2020); (iii) extensions with inverse propensity score weighting, such as targeted maximum likelihood estimation (TMLE; Van der Laan et al., 2011) and Dragonnet with targeted regularization (Shi et al., 2019); and (iv) adversarial training of generative models, such as GANITE (Yoon et al., 2018). In particular, TARNet uses loss functions that minimize the mean squared error (MSE) between observed and predicted causal outcomes after representation learning, i.e., after deconfounding the treatment from the outcome by forcing the treated and control covariate distributions closer together, and CFRNet additionally minimizes the distance between the treated and control covariate distributions.…”
Section: Related Work and Contribution
confidence: 99%
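The meta-learner idea described in the excerpt above can be sketched compactly. The following is a minimal, illustrative T-learner that substitutes plain least squares for the deep networks discussed in the review; the function name and the simulated data are hypothetical, not taken from any of the cited papers:

```python
import numpy as np

def t_learner_cate(X, t, y):
    """T-learner sketch: fit one outcome model per treatment arm
    (here plain OLS instead of a deep net), then contrast the two
    arms' predictions to get per-unit treatment effects."""
    Xb = np.column_stack([np.ones(len(X)), X])              # add intercept
    beta1, *_ = np.linalg.lstsq(Xb[t == 1], y[t == 1], rcond=None)
    beta0, *_ = np.linalg.lstsq(Xb[t == 0], y[t == 0], rcond=None)
    return Xb @ beta1 - Xb @ beta0                          # CATE estimates

# Hypothetical simulation: the true effect is 1 + X[:, 1], so the
# average treatment effect is close to 1.0.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
t = rng.integers(0, 2, size=500)
y = X[:, 0] + t * (1.0 + X[:, 1]) + rng.normal(scale=0.1, size=500)
cate = t_learner_cate(X, t, y)
print(cate.mean())  # close to the true average effect of 1.0
```

An S-learner would instead fit a single model on (X, t) jointly and contrast its predictions at t = 1 and t = 0; the deep variants in the review replace the per-arm regressions with neural network heads on a shared representation.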
“…To calculate the ATE, the TMLE is a two-step procedure: the first step is an input-output model fit without adjusting for the inverse propensity score weights, and the second step estimates the fluctuation parameter ε as a constant regression coefficient for the slope between the inverse propensity score weights and the observed outcome. It was noted (Koch et al., 2021) that this would be equivalent to minimizing the weighted term involving ε in equation (5).…”
Section: No Censoring (NC) Cases
confidence: 99%
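The two-step TMLE procedure described in this excerpt can be sketched as follows. This is a minimal linear-fluctuation variant for a continuous outcome with known propensity scores, matching the "constant regression coefficient" description above; practical TMLE implementations typically use a logistic fluctuation on a bounded outcome, and all names and the simulation here are hypothetical:

```python
import numpy as np

def tmle_ate(y, t, q1, q0, g):
    """Minimal TMLE targeting-step sketch (continuous outcome, linear
    fluctuation): correct the initial outcome predictions q1/q0 with a
    single fluctuation coefficient eps, fitted against the 'clever
    covariate' built from the propensity scores g."""
    q_obs = np.where(t == 1, q1, q0)                 # prediction for observed arm
    h = t / g - (1 - t) / (1 - g)                    # clever covariate
    eps = np.sum(h * (y - q_obs)) / np.sum(h * h)    # OLS slope for eps
    q1_star = q1 + eps / g                           # targeted (fluctuated) fits
    q0_star = q0 - eps / (1 - g)
    return float(np.mean(q1_star - q0_star))

# Hypothetical simulation: the true ATE is 2.0; the initial outcome
# model (q = x, ignoring treatment) is deliberately biased, and the
# targeting step corrects it.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
t = rng.integers(0, 2, size=n)
y = x + 2.0 * t + rng.normal(scale=0.1, size=n)
ate = tmle_ate(y, t, q1=x, q0=x, g=np.full(n, 0.5))
print(ate)  # close to the true ATE of 2.0
```

The first step (fitting q1/q0 and g) is where the deep networks of the review would appear; the targeting step then trades a small amount of bias in the initial fit for the double-robustness property that motivates TMLE.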