2017
DOI: 10.1002/cpa.21706

Explanation of Variability and Removal of Confounding Factors from Data through Optimal Transport

Abstract: A methodology based on the theory of optimal transport is developed to attribute variability in data sets to known and unknown factors and to remove such attributable components of the variability from the data. Denoting by x the quantities of interest and by z the explanatory factors, the procedure transforms x into filtered variables y through a z-dependent map, so that the conditional probability distributions ρ(x|z) are pushed forward into a target distribution μ(y), independent of z. Among all maps and target d…

Citations: cited by 24 publications (38 citation statements)
References: 21 publications
“…The Wasserstein barycenter and its computation have been studied in many contexts, such as optimal transport theory (Cuturi and Doucet, 2014; Anderes et al., 2016). In Tabak and Trigila (2018), the Wasserstein barycenter has been suggested as a method to remove nuisance variation in high-throughput biological experiments. Two key ingredients of the Wasserstein barycenter are that (i) the nuisance variation is removed in the sense that a number of distinct distributions are transformed into a common distribution, and hence become indistinguishable; and (ii) the distributions are minimally perturbed by the transformations.…”
Section: General Approach (citation type: mentioning)
confidence: 99%
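The two ingredients named in this statement can be illustrated in one dimension, where the Wasserstein-2 barycenter with equal weights has a quantile function equal to the average of the groups' quantile functions. The following is a minimal sketch under stated assumptions, not the cited authors' algorithm: the function names `barycenter_1d` and `push_to_barycenter` are hypothetical, the data are synthetic, and all groups are assumed to have the same number of samples.

```python
import random


def barycenter_1d(groups):
    """Quantile-average barycenter of equally sized 1D samples (illustrative)."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equally sized groups")
    sorted_groups = [sorted(g) for g in groups]
    # Averaging the i-th order statistic across groups averages the
    # empirical quantile functions, which gives the barycenter in 1D.
    return [sum(s[i] for s in sorted_groups) / len(groups) for i in range(n)]


def push_to_barycenter(group, bary):
    """Send each sample to the barycenter point of the same rank (monotone map)."""
    order = sorted(range(len(group)), key=lambda i: group[i])
    out = [0.0] * len(group)
    for rank, i in enumerate(order):
        out[i] = bary[rank]
    return out


# Synthetic data: two groups that differ only through a nuisance factor.
random.seed(0)
groups = [
    [random.gauss(0.0, 1.0) for _ in range(200)],
    [random.gauss(3.0, 2.0) for _ in range(200)],
]
bary = barycenter_1d(groups)
filtered = [push_to_barycenter(g, bary) for g in groups]
```

After the maps are applied, both filtered groups carry the same empirical distribution (ingredient i), and each map is monotone, which in one dimension is the displacement-minimizing choice (ingredient ii).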
“…In order to provide a more flexible framework for data science applications, sample-based techniques to solve the OT problem were developed in [19,7,20]. A central question to address when posing sample-based OT problems is the meaning of the push-forward condition T#µ = ν when µ and ν are only known through samples {x_i}, {y_j}.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The authors have recently developed Tabak and Trigila [2017] a unified framework for the explanation of variability, based on extensions of the mathematical theory of optimal transport. In this framework, one estimates the conditional probability distributions ρ(x|z) by mapping them to their Wasserstein barycenter Agueh and Carlier [2011].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In this framework, one estimates the conditional probability distributions ρ(x|z) by mapping them to their Wasserstein barycenter Agueh and Carlier [2011]. It was shown in Tabak and Trigila [2017] that principal components emerge naturally from the methodology's simplest setting, with maps restricted to rigid translations, and hence capturing not the full conditional probability distribution ρ(x|z) but only its conditional expectation x̄(z). From here stems one of this article's legs: if one thinks of principal components in terms of explanation of variability, it is natural to consider more general scenarios, where the covariates, though still with values a priori unknown, are associated with particular attributes such as space, time or activity networks, and hence required to be smooth in the topologies associated with these attributes.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
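The "simplest setting" described in this statement, maps restricted to rigid translations, can be sketched for a known, discrete factor z: each sample is shifted by its conditional mean x̄(z) and then moved to the overall mean, so that only the conditional expectation is removed. This is an illustrative sketch under those assumptions, not the cited paper's procedure; the function name `remove_conditional_mean` and the toy data are hypothetical.

```python
def remove_conditional_mean(x, z):
    """Translate each sample so every z-group shares the overall mean."""
    overall = sum(x) / len(x)
    sums, counts = {}, {}
    for xi, zi in zip(x, z):
        sums[zi] = sums.get(zi, 0.0) + xi
        counts[zi] = counts.get(zi, 0) + 1
    # Estimate of the conditional expectation for each value of z.
    means = {k: sums[k] / counts[k] for k in sums}
    # Rigid translation per group: subtract the group mean, add the overall mean.
    return [xi - means[zi] + overall for xi, zi in zip(x, z)]


# Toy data: two groups whose means differ by 10.
x = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
z = ["a", "a", "a", "b", "b", "b"]
y = remove_conditional_mean(x, z)
print(y)  # [6.0, 7.0, 8.0, 6.0, 7.0, 8.0]
```

The within-group spread is untouched, which is why a translation-only map captures x̄(z) but not the rest of ρ(x|z).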