2016
DOI: 10.48550/arxiv.1612.00889
Preprint

New Frameworks for Offline and Streaming Coreset Constructions

Abstract: Let P be a set (called points), let Q be a set (called queries), and let f : P × Q → [0, ∞) be a function (called cost). For an error parameter ε > 0, a set S ⊆ P with a weight function w : P → [0, ∞) is an ε-coreset if ∑_{s∈S} w(s)·f(s, q) approximates ∑_{p∈P} f(p, q) up to a multiplicative factor of 1 ± ε for every query q ∈ Q. Coresets are used to solve fundamental problems in machine learning on streaming and distributed data. We construct coresets for the k-means clustering of n input points, both in an arbitrary metric s…
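To make the definition concrete, here is a minimal Python sketch (not from the paper; the uniform sample and weights are illustrative only and carry no coreset guarantee) that spot-checks the ε-coreset ratio for the 1-means cost f(p, q) = ‖p − q‖²:

```python
import numpy as np

def coreset_ratio(P, S, w, q):
    """Ratio of weighted coreset cost to full cost for query q,
    with f(p, q) = squared Euclidean distance."""
    full = np.sum(np.linalg.norm(P - q, axis=1) ** 2)
    core = np.sum(w * np.linalg.norm(S - q, axis=1) ** 2)
    return core / full

rng = np.random.default_rng(0)
P = rng.normal(size=(10_000, 2))            # point set P
idx = rng.choice(len(P), size=500, replace=False)
S = P[idx]                                  # sampled subset S ⊆ P
w = np.full(len(S), len(P) / len(S))        # uniform weights (illustration only)

# An ε-coreset must keep this ratio inside [1 - ε, 1 + ε] for EVERY query q;
# we only spot-check a few random queries here.
for q in rng.normal(size=(5, 2)):
    print(f"q={q.round(2)}  ratio={coreset_ratio(P, S, w, q):.4f}")
```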

Cited by 52 publications (98 citation statements)
References 29 publications
“…If û assigns fractional weights, then some elements of B might be assigned to more than one label, as long as the sum of weights over every element of B is 1. By construction, (Ŝ, ŵ) satisfies Properties (10)–(11). Hence, (Ŝ, ŵ) is a smoothed version of (Ĉ, û).…”
Section: Coreset Construction
confidence: 94%
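A minimal illustration of that row-sum constraint (the matrix below is hypothetical, not the cited paper's û): each element of B may split its unit weight across several labels, provided its weights sum to 1.

```python
import numpy as np

# Hypothetical fractional assignment û: rows = elements of B, columns = labels.
# An element may be assigned to more than one label, as long as its total
# assigned weight is exactly 1 (each row sums to 1).
u_hat = np.array([
    [1.0, 0.0, 0.0],   # fully assigned to label 0
    [0.4, 0.6, 0.0],   # split between labels 0 and 1
    [0.0, 0.3, 0.7],   # split between labels 1 and 2
])

assert np.allclose(u_hat.sum(axis=1), 1.0)  # the smoothing precondition
```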
“…Furthermore, a coreset for a family of classifiers is often a "silver bullet" that provides a unified solution to all of Challenges (i)–(iv) above. Combining the two main coreset properties, merge and reduce [32, 7, 26, 1], which are usually satisfied, with the fact that a coreset approximates every model, and not just the optimal model, enables it to support streaming and distributed data [11, 41], parallel computation [21], constrained versions of the problem [25], model compression [20], parameter tuning [43], and more.…”
Section: Coresets
confidence: 99%
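Since merge-and-reduce is the mechanism behind the streaming support mentioned above, here is a hedged sketch of the standard bucket scheme (the `reduce` placeholder is illustrative; a real instantiation would plug in an offline coreset construction such as the paper's):

```python
from typing import Callable, List, Optional, Tuple

Weighted = List[Tuple[object, float]]  # (point, weight) pairs

def merge_and_reduce(stream, block_size: int,
                     reduce_fn: Callable[[Weighted], Weighted]):
    """Generic merge-and-reduce: levels[i] holds at most one coreset
    summarizing block_size * 2**i input points."""
    levels: List[Optional[Weighted]] = []
    block: Weighted = []
    for p in stream:
        block.append((p, 1.0))
        if len(block) == block_size:
            carry = reduce_fn(block)
            block = []
            i = 0
            # Merge equal-level coresets upward, like binary addition.
            while i < len(levels) and levels[i] is not None:
                carry = reduce_fn(levels[i] + carry)
                levels[i] = None
                i += 1
            if i == len(levels):
                levels.append(None)
            levels[i] = carry
    # Final summary: the partial block plus every remaining level.
    out: Weighted = list(block)
    for c in levels:
        if c is not None:
            out += c
    return out

def toy_reduce(c: Weighted) -> Weighted:
    # Illustration only: keep every other point with doubled weight.
    # A real construction would use sensitivity sampling or similar.
    return [(p, 2 * w) for (p, w) in c[::2]]

summary = merge_and_reduce(range(1000), block_size=64, reduce_fn=toy_reduce)
print(len(summary), sum(w for _, w in summary))  # total weight stays 1000
```

Because each level only ever stores one coreset, the space is logarithmic in the stream length, and the 1 ± ε error compounds only across the O(log n) levels of the tree.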
“…As a result, they address a spectrum of clustering problems, such as k-median clustering, k-line median clustering, and projective clustering, as well as other problems such as subspace approximation. Braverman et al. [3] improved the aforementioned framework by switching to (ε, η)-approximations, which leads to substantially smaller sample sizes in many cases. They also simplified and further generalized the framework and applied it to k-means clustering of points in R^d.…”
Section: Related Work
confidence: 99%
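For context, the framework these citations refer to is sensitivity (importance) sampling. Below is a hedged Python sketch for k-means using a common upper bound on sensitivities derived from a rough initial solution; the bound, sample size, and weights are simplified for illustration and are not the exact quantities from [3]:

```python
import numpy as np

def kmeans_sensitivity_coreset(P, centers, m, rng):
    """Importance-sampling coreset for k-means (simplified sketch).
    Uses a standard upper bound on sensitivities derived from a given
    rough (bicriteria) solution `centers`; not the paper's exact bound."""
    d2 = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # sq. distances
    label = d2.argmin(1)
    cost = d2.min(1)
    cluster_size = np.bincount(label, minlength=len(centers))
    cluster_cost = np.bincount(label, weights=cost, minlength=len(centers))
    # Sensitivity upper bound: a point's share of its cluster's cost
    # plus a uniform 1/|cluster| term.
    s = (cost / np.maximum(cluster_cost[label], 1e-12)
         + 1.0 / np.maximum(cluster_size[label], 1))
    prob = s / s.sum()
    idx = rng.choice(len(P), size=m, p=prob)
    weights = 1.0 / (m * prob[idx])  # inverse-probability reweighting
    return P[idx], weights

rng = np.random.default_rng(1)
P = np.vstack([rng.normal(c, 0.3, size=(500, 2)) for c in [(0, 0), (5, 5), (0, 5)]])
centers = P[rng.choice(len(P), size=3, replace=False)]  # crude initial solution
S, w = kmeans_sensitivity_coreset(P, centers, m=200, rng=rng)

q = centers  # evaluate one candidate query (a set of k centers)
full = ((P[:, None] - q[None]) ** 2).sum(-1).min(1).sum()
core = (w * ((S[:, None] - q[None]) ** 2).sum(-1).min(1)).sum()
print(f"full={full:.1f}  coreset={core:.1f}  ratio={core/full:.3f}")
```

Sampling proportionally to the sensitivity upper bounds and reweighting by inverse probability keeps the cost estimator unbiased for every query; the frameworks' contribution lies in bounding how many samples suffice for a uniform 1 ± ε guarantee over all queries simultaneously.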