2018
DOI: 10.48550/arxiv.1805.07412
Preprint

Wasserstein Measure Coresets

Cited by 5 publications (7 citation statements) · References 0 publications
“…Furthermore, this problem formulation ignores the fact that a dataset is an empirical sample of a data distribution, which describes a learning task. To build a bridge between coresets and the Wasserstein distance, we introduce the idea of the Wasserstein coreset (Claici, Genevay, and Solomon 2020) and follow the notations and definitions of coreset and measure coreset. Then we have the following proposition:…”
Section: Methods (mentioning)
confidence: 99%
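The "measure coreset" referenced in the statement above can be formalized as the measure with small support that is closest to the data distribution in Wasserstein distance. The notation below is a sketch of that idea in the spirit of Claici, Genevay, and Solomon, not a verbatim reproduction of the paper's definitions:

```latex
% A measure coreset of size n for a data distribution \mu:
% the measure \nu supported on at most n points that minimizes
% the Wasserstein-p distance to \mu.
\nu^\star \in \operatorname*{arg\,min}_{\nu \in \mathcal{P}_n(\mathcal{X})} W_p(\mu, \nu),
\qquad
W_p(\mu, \nu)^p = \min_{\pi \in \Pi(\mu, \nu)} \int \lVert x - y \rVert^p \, d\pi(x, y),
```

where $\mathcal{P}_n(\mathcal{X})$ denotes probability measures on $\mathcal{X}$ supported on at most $n$ points and $\Pi(\mu,\nu)$ is the set of couplings with marginals $\mu$ and $\nu$.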
“…Sensitivity-based methods, such as those for k-clustering problems, take into account the importance of samples via approximate probability (Bachem, Lucic, and Lattanzi 2018; Bateni et al. 2014). Distribution-based methods typically require consideration of the underlying data distribution, such as designing the coreset based on Reproducing Kernel Hilbert Space (RKHS) theory (Chen, Welling, and Smola 2012) or utilizing an integral probability metric in the context of optimal transport theory (Claici, Genevay, and Solomon 2018). However, these traditional coreset methods (Feldman, Faulkner, and Krause 2011; Bachem, Lucic, and Krause 2015; Zhang et al. 2023) face challenges due to their high computational complexity and reliance on fixed data representations (which are seldom suited for image data).…”
Section: Related Work (mentioning)
confidence: 99%
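The RKHS-based line of work mentioned above (Chen, Welling, and Smola 2012) builds coresets by kernel herding: greedily picking points whose empirical mean embedding tracks the dataset's kernel mean embedding. The following is a minimal sketch under assumed choices (RBF kernel, the dataset itself as the candidate pool); it illustrates the technique, not the cited paper's exact algorithm:

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """RBF kernel matrix between row-vector sets a (n,d) and b (m,d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_herding(X, k, gamma=1.0):
    """Greedily select k indices whose mean embedding approximates
    the dataset's kernel mean embedding in the RKHS."""
    K = rbf(X, X, gamma)      # n x n kernel matrix over the candidate pool
    mu = K.mean(axis=1)       # kernel mean embedding evaluated at each point
    chosen = []
    for t in range(k):
        if chosen:
            # herding score: alignment with mu minus alignment with
            # the points already selected
            score = mu - K[:, chosen].sum(axis=1) / (t + 1)
        else:
            score = mu
        chosen.append(int(np.argmax(score)))
    return np.array(chosen)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
idx = kernel_herding(X, 5)
print(idx)
```

Restricting candidates to the dataset keeps the sketch simple; the original herding formulation optimizes over the whole input space.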
“…Constructing a core-set from a large dataset is an optimization problem: finding a smaller set that best approximates the original dataset with respect to a certain measure. Claici et al. [8] leveraged optimal transport theory and introduced the Wasserstein distance as the measure for computing the core-set. Their work aims to minimize the Wasserstein distance between the core-set and a given input data distribution.…”
Section: Problem 3: Transport-based Coreset (mentioning)
confidence: 99%
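The transport-based objective described above can be sketched concretely in one dimension, where the Wasserstein-1 distance between two equally sized uniform empirical measures has a closed form (the mean absolute difference of the sorted samples). The quantile-based coreset below is a hypothetical illustration of the objective, not the construction of Claici et al.:

```python
import numpy as np

def w1_equal_weights(a, b):
    """Exact 1-D Wasserstein-1 distance between two uniform empirical
    measures of equal size: mean absolute difference of sorted samples."""
    a, b = np.sort(a), np.sort(b)
    assert len(a) == len(b)
    return float(np.mean(np.abs(a - b)))

rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# Hypothetical coreset: k mid-quantiles of the data, each carrying
# weight n/k. Lift it back to n points so both measures are uniform
# over the same number of atoms and the closed form applies.
k = 10
coreset = np.quantile(data, (np.arange(k) + 0.5) / k)
lifted = np.repeat(coreset, len(data) // k)

print(w1_equal_weights(data, lifted))
```

The printed distance is small because quantiles are a good 1-D summary; a transport-based coreset method would instead optimize the support points to minimize this distance directly.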