2020
DOI: 10.48550/arxiv.2003.06016
Preprint

Invariant Causal Prediction for Block MDPs

Cited by 9 publications (17 citation statements). References 10 publications.
“…For instance, it is reasonable that multitask learning (Caruana, 1997) has been used successfully across all applications of machine learning, because multitask data provide multiple views of their shared features (Nature Variables), making inference about them more accurate, as suggested by MUA. Another example is that in reinforcement learning, multiview data have been widely leveraged to discover the invariant part of states (Lu et al, 2018; Zhang et al, 2020).…”

Section: Implications of a Unifying View
Mentioning confidence: 99%
“…It not only aims to address the challenging OOD prediction problem, but is also pioneering work that guides causal machine learning research towards the development of inductive biases that impose causal constraints. The effectiveness of IRM and its variants has been demonstrated across various areas including computer vision [1], natural language processing [5], CTR prediction [28], reinforcement learning [29] and financial forecasting [14]. [2] propose the original formulation of IRM as a two-stage optimization problem:…”

Section: Preliminaries and IRM
Mentioning confidence: 99%
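
The excerpt truncates the formulation it refers to. Assuming [2] is Arjovsky et al. (2019), the two-stage (bilevel) IRM problem is standardly written as

\min_{\Phi,\, w} \; \sum_{e \in \mathcal{E}_{\mathrm{tr}}} R^{e}(w \circ \Phi)
\quad \text{subject to} \quad
w \in \operatorname*{arg\,min}_{\bar{w}} R^{e}(\bar{w} \circ \Phi) \;\; \text{for all } e \in \mathcal{E}_{\mathrm{tr}},

where \Phi is a shared data representation, w a classifier, and R^{e} the risk in training environment e. The outer stage minimizes the pooled risk across training environments, while the inner stage constrains w to be simultaneously optimal in every environment.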
“…Finally, the IRM method has recently been applied to RL problems. In [24], the authors attempt to learn a causal Markov decision process (MDP) that is bisimilar to the full MDP present during training. This formulation requires learning a model for both the causal and full dynamics of the system, a mapping between the two, and a causal model of the rewards.…”

Section: Related Work
Mentioning confidence: 99%
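
For context, “bisimilar” here refers to MDP bisimulation (in the sense of Givan, Dean, and Greig, 2003); the cited work may use a relaxed or metric variant, but the standard relation states that s_1 and s_2 are bisimilar under an equivalence relation E if, for every action a and every equivalence class C of E,

R(s_1, a) = R(s_2, a)
\quad \text{and} \quad
\sum_{s' \in C} P(s' \mid s_1, a) \;=\; \sum_{s' \in C} P(s' \mid s_2, a).

Intuitively, bisimilar states yield identical rewards and transition identically up to the equivalence classes, which is why the smaller causal MDP can stand in for the full MDP during training.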