Quantized Compressive K-Means

Schellekens, Vincent; Jacques, Laurent

doi:10.1109/lsp.2018.2847908

Cited by 16 publications

(12 citation statements)

References 28 publications

“…Although we focused on k-means clustering, other learning tasks can be solved in a compressive manner and should be investigated, such as Gaussian mixtures fitting or principal components analysis. We leave for future work the idea of using quantized sketches [26] (quantization for privacy has already been considered [27,28]), and leveraging fast transforms to speed-up the process [29]. Using additive noise on the data samples themselves is also a possibility that should be investigated.…”

Section: Results (mentioning)

confidence: 99%

Differentially Private Compressive K-means

Schellekens, Chatalic, Houssiau, et al. 2019

ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

This work addresses the problem of learning from large collections of data with privacy guarantees. The sketched learning framework proposes to deal with the large scale of datasets by compressing them into a single vector of generalized random moments, from which the learning task is then performed. We modify the standard sketching mechanism to provide differential privacy, using addition of Laplace noise combined with a subsampling mechanism (each moment is computed from a subset of the dataset). The data can be divided between several sensors, each applying the privacy-preserving mechanism locally, yielding a differentially-private sketch of the whole dataset when reunited. We apply this framework to the k-means clustering problem, for which a measure of utility of the mechanism in terms of a signal-to-noise ratio is provided, and discuss the obtained privacy-utility tradeoff.
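The mechanism described in this abstract (Laplace noise added to a sketch, with per-sensor sketches pooled afterwards) can be illustrated with a minimal sketch. It assumes each sensor already holds a clean sketch, i.e. a list of complex moments averaged over its own samples. The function names `laplace`, `privatize`, and `merge`, the example values, and the `noise_scale` parameter are hypothetical. The abstract does not say how the noise scale is calibrated to a privacy budget, so it is left as a free parameter, and the subsampling step is omitted.

```python
import random


def laplace(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two i.i.d. exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def privatize(sketch, noise_scale, rng):
    """Add independent Laplace noise to the real and imaginary part of each moment.

    noise_scale is a free parameter here (assumption): its calibration to a
    differential-privacy budget is not covered by this illustration.
    """
    return [
        z + complex(laplace(noise_scale, rng), laplace(noise_scale, rng))
        for z in sketch
    ]


def merge(sketches, counts):
    """Pool per-sensor sketches into one sketch of the union of their data.

    Each sketch is an average over counts[i] samples, so the pooled sketch is
    the count-weighted average of the per-sensor sketches.
    """
    total = sum(counts)
    num_moments = len(sketches[0])
    return [
        sum(c * s[j] for s, c in zip(sketches, counts)) / total
        for j in range(num_moments)
    ]


rng = random.Random(0)
sensor_a = [0.5 + 0.1j, -0.2 + 0.3j]  # hypothetical clean sketch from 100 samples
sensor_b = [0.1 - 0.4j, 0.6 + 0.0j]   # hypothetical clean sketch from 300 samples

# Each sensor adds noise locally; only the noisy sketches are shared and pooled.
noisy_a = privatize(sensor_a, noise_scale=0.01, rng=rng)
noisy_b = privatize(sensor_b, noise_scale=0.01, rng=rng)
pooled = merge([noisy_a, noisy_b], counts=[100, 300])
print(pooled)
```

Because the sketch is an average, pooling is a weighted mean of the local sketches, which is why the noise can be added at each sensor before anything is shared.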

“…[25] Shape recognition Fuzzy k-means clustering ensemble (FKMCE). [26] Signal processing Compressive k-means clustering (CKM).…”

Section: Reference (mentioning)

confidence: 99%

The k-means Algorithm: A Comprehensive Survey and Performance Evaluation

2020

The k-means clustering algorithm is considered one of the most powerful and popular data mining algorithms in the research community. However, despite its popularity, the algorithm has certain limitations, including problems associated with random initialization of the centroids which leads to unexpected convergence. Additionally, such a clustering algorithm requires the number of clusters to be defined beforehand, which is responsible for different cluster shapes and outlier effects. A fundamental problem of the k-means algorithm is its inability to handle various data types. This paper provides a structured and synoptic overview of research conducted on the k-means algorithm to overcome such shortcomings. Variants of the k-means algorithms including their recent developments are discussed, where their effectiveness is investigated based on the experimental analysis of a variety of datasets. The detailed experimental analysis along with a thorough comparison among different k-means clustering algorithms differentiates our work compared to other existing survey papers. Furthermore, it outlines a clear and thorough understanding of the k-means algorithm along with its different research directions.

“…The effect of dithering is to make the quantized Φ_q behave similarly to non-quantized Φ_RF on average. For instance, it was shown in [43] that for each W, x, x′, and ξ,…”

Section: Sketching With Quantized Contributions (mentioning)

confidence: 99%
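The dithering statement quoted above can be checked numerically in a simplified setting. This is a minimal sketch under an assumption not stated in the quote: that the quantized feature is a 1-bit square wave, sign(cos(t)), applied to a projection t plus a dither ξ uniform on [0, 2π). The helper names `square_wave`, `dithered_correlation`, and `triangle` are hypothetical. Averaged over the dither, the product of two quantized features depends only on the difference of their arguments, which is also true of non-quantized cosine features; without dithering it depends on the absolute positions.

```python
import math


def square_wave(t):
    """1-bit periodic quantizer: the sign of cos(t), with values in {-1, +1}."""
    return 1.0 if math.cos(t) >= 0.0 else -1.0


def dithered_correlation(t1, t2, num_dithers=20000):
    """Average square_wave(t1 + xi) * square_wave(t2 + xi) over dithers xi.

    The dithers lie on a uniform midpoint grid of [0, 2*pi), which approximates
    the expectation over a uniformly distributed dither deterministically.
    """
    total = 0.0
    for k in range(num_dithers):
        xi = 2.0 * math.pi * (k + 0.5) / num_dithers
        total += square_wave(t1 + xi) * square_wave(t2 + xi)
    return total / num_dithers


def triangle(delta):
    """Triangle wave 1 - 2|delta|/pi on [-pi, pi], extended with period 2*pi."""
    delta = (delta + math.pi) % (2.0 * math.pi) - math.pi
    return 1.0 - 2.0 * abs(delta) / math.pi


# Two pairs of arguments with the same difference (0.7) but different positions.
pairs = [(0.0, 0.7), (1.2, 1.9)]
for t1, t2 in pairs:
    undithered = square_wave(t1) * square_wave(t2)
    dithered = dithered_correlation(t1, t2)
    print(t1, t2, undithered, dithered, triangle(t2 - t1))
```

The two pairs disagree without dithering but give the same dithered average, which matches the triangle wave of the difference. This illustrates the kind of "on average" behaviour the quote refers to; it does not reproduce the specific result of [43].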

Sketching Data Sets for Large-Scale Learning: Keeping only what you need

Gribonval, Chatalic, Keriven, et al. 2021

IEEE Signal Process. Mag.

Self Cite

Big data can be a blessing: with very large training datasets it becomes possible to perform complex learning tasks with unprecedented accuracy. Yet, this improved performance comes at the price of enormous computational challenges. Thus, one may wonder: Is it possible to leverage the information content of huge datasets while keeping computational resources under control? Can this also help solve some of the privacy issues raised by large-scale learning? This is the ambition of compressive learning, where the dataset is massively compressed before learning. Here, a "sketch" is first constructed by computing carefully chosen nonlinear random features (e.g., random Fourier features) and averaging them over the whole dataset. Parameters are then learned from the sketch, without access to the original dataset. This article surveys the current state-of-the-art in compressive learning, including the main concepts and algorithms; their connections with established signal-processing methods; existing theoretical guarantees, on both information preservation and privacy preservation; and important open problems. For an extended version of this article that contains additional references and more in-depth discussions on a variety of topics, see [1].
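The sketch construction described in this abstract (random Fourier features averaged over the dataset) can be illustrated as follows. This is a minimal sketch under stated assumptions: the frequencies are drawn from an isotropic Gaussian, the class name `Sketch`, the helper names, and the toy two-cluster data are all hypothetical, and the learning step that recovers parameters from the sketch is not shown.

```python
import cmath
import random


def draw_frequencies(num_features, dim, scale, rng):
    """Draw random frequency vectors (assumption: i.i.d. Gaussian entries)."""
    return [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(num_features)]


def random_fourier_features(x, freqs):
    """Map one data point x to the complex features exp(i * <w, x>), one per frequency w."""
    return [
        cmath.exp(1j * sum(w_k * x_k for w_k, x_k in zip(w, x)))
        for w in freqs
    ]


class Sketch:
    """Running average of random Fourier features over a stream of points."""

    def __init__(self, freqs):
        self.freqs = freqs
        self.count = 0
        self.mean = [0j] * len(freqs)

    def update(self, x):
        """Fold one point into the average; the point can be discarded afterwards."""
        self.count += 1
        feats = random_fourier_features(x, self.freqs)
        self.mean = [m + (f - m) / self.count for m, f in zip(self.mean, feats)]


rng = random.Random(0)
freqs = draw_frequencies(num_features=8, dim=2, scale=1.0, rng=rng)

# Toy data: 1000 two-dimensional points around two hypothetical cluster centres.
centres = [(-2.0, 0.0), (2.0, 1.0)]
points = []
for _ in range(1000):
    cx, cy = rng.choice(centres)
    points.append((cx + rng.gauss(0.0, 0.3), cy + rng.gauss(0.0, 0.3)))

sk = Sketch(freqs)
for p in points:
    sk.update(p)

# The whole dataset is now summarized by 8 complex numbers.
print(sk.count, len(sk.mean))
```

The sketch size depends only on the number of features, not on the number of points, and the running-average form shows that a single pass over the data suffices.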
