2021
DOI: 10.48550/arxiv.2111.10106
Preprint
A Large Scale Benchmark for Individual Treatment Effect Prediction and Uplift Modeling

Abstract: Individual Treatment Effect (ITE) prediction is an important area of research in machine learning that aims at explaining and estimating the causal impact of an action at the granular level. It represents a problem of growing interest in multiple sectors of application such as healthcare, online advertising, or socioeconomics. To foster research on this topic we release a publicly available collection of 13.9 million samples collected from several randomized control trials, scaling up previously available data…
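The ITE setting described in the abstract can be illustrated with a minimal two-model ("T-learner") sketch: fit one outcome model per treatment arm and predict the per-individual effect as the difference of the two models' predictions. Everything below is an assumption for illustration (synthetic data, linear base learners), not the paper's benchmark or method.

```python
import numpy as np

# Synthetic randomized trial: outcome probability rises with x0,
# and treatment adds an extra lift only when x0 > 0.
rng = np.random.default_rng(0)
n = 5_000
X = rng.normal(size=(n, 4))
t = rng.integers(0, 2, size=n)                      # randomized treatment flag
p = 1 / (1 + np.exp(-(X[:, 0] + 0.8 * t * (X[:, 0] > 0))))
y = (rng.random(n) < p).astype(float)

# T-learner with linear base learners: one least-squares fit per arm,
# then the ITE estimate is the difference of the two predictions.
Xa = np.hstack([np.ones((n, 1)), X])                # add intercept column
w0, *_ = np.linalg.lstsq(Xa[t == 0], y[t == 0], rcond=None)
w1, *_ = np.linalg.lstsq(Xa[t == 1], y[t == 1], rcond=None)
ite = Xa @ w1 - Xa @ w0                             # per-individual uplift
print("mean predicted ITE:", round(float(ite.mean()), 3))
```

Because treatment is randomized, the two arms share the same feature distribution, so the difference of the fitted models is an unbiased (if crude) granular effect estimate; individuals with x0 > 0 should receive higher predicted uplift than the rest.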

Cited by 2 publications (6 citation statements)
References 19 publications (32 reference statements)
“…• CRITEO-UPLIFT v2 (Diemert et al. 2021) is provided by the AdTech company Criteo. The data contains 13.9 million samples collected from several incremental A/B tests.…”
Section: Datasets
“…These include three aspects that affect ease of use (benchmark results, hyperparameter search, and provided optimal parameters), two aspects of data-preprocessing settings (instance deduplication and feature normalization), and one aspect that measures the degree of compatibility with DUM. As shown in Table 1, we evaluate against these considerations a set of existing benchmarks and libraries for uplift modeling and causal inference that are closely relevant to our work, including Criteo-ITE-Benchmark [13], CATENets [10], DoWhy [27], EconML [5], CausalML [7], DECI, and ShowWhy. We find that none of them fully meets these requirements, which motivates us to propose a new public benchmark for DUM in this paper.…”
Section: Comparison With Existing Work
“…It provides datasets from two different sources: the training set is collected from a production environment with a treatment bias, where treatment allocation is selective due to the operational targeting policy, while the test set is collected from users unaffected by the targeting strategy, so treatment assignment follows a randomized controlled trial (RCT). 2) Criteo [13] is a dataset from real advertising scenarios provided by Criteo AI Labs. It contains nearly 14 million instances with similar treatment bias, 12 continuous features, a treatment indicator, and 2 labels (i.e., visits and conversions).…”
Section: Datasets
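The schema described in the citation above (12 continuous features, a binary treatment indicator, and two binary labels for visits and conversions) can be exercised with a small synthetic stand-in. Column names and the simulated effect size below are assumptions for illustration, not the actual CRITEO-UPLIFT v2 file layout.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in mimicking the described schema: 12 continuous
# features f0..f11, a binary treatment flag, and visit/conversion labels.
rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame(rng.normal(size=(n, 12)),
                  columns=[f"f{i}" for i in range(12)])
df["treatment"] = rng.integers(0, 2, size=n)
# Simulate a small positive treatment effect on the visit rate.
visit_prob = 0.05 + 0.02 * df["treatment"]
df["visit"] = (rng.random(n) < visit_prob).astype(int)
df["conversion"] = (rng.random(n) < 0.01).astype(int)

# Naive average uplift: difference in visit rates between the two arms.
rates = df.groupby("treatment")["visit"].mean()
avg_uplift = rates.loc[1] - rates.loc[0]
print(f"average uplift on visits: {avg_uplift:.4f}")
```

On an RCT-style table like this, the simple difference in group means already estimates the average treatment effect; the per-individual (uplift) models discussed in the surrounding citations refine this to a prediction conditioned on the feature columns.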