Despite the popularity of explainable AI, there is limited work on effective methods for unsupervised learning. We study algorithms for k-means clustering, focusing on the trade-off between explainability and accuracy. Following prior work, we use a small decision tree to partition a dataset into k clusters. This enables us to explain each cluster assignment by a short sequence of single-feature thresholds. While larger trees produce more accurate clusterings, they also require more complex explanations. To allow flexibility, we develop a new explainable k-means clustering algorithm, ExKMC, that takes an additional parameter k′ ≥ k and outputs a decision tree with k′ leaves. We use a new surrogate cost to efficiently expand the tree and to label the leaves with one of the k clusters. We prove that as k′ increases, the surrogate cost is non-increasing, and hence we trade explainability for accuracy. Empirically, we validate that ExKMC produces a low-cost clustering, outperforming both standard decision tree methods and other algorithms for explainable clustering. An implementation of ExKMC is available at https://github.com/navefr/ExKMC.
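For concreteness, a minimal usage sketch is shown below. It follows the style of the repository's README, where a `Tree` is built with `k` clusters and `max_leaves` tree leaves; treat the exact signatures as assumptions to be checked against the repository.

```python
import numpy as np
from ExKMC.Tree import Tree  # pip install ExKMC

# Toy data: 500 points in 10 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))

k = 5             # number of clusters
k_prime = 2 * k   # number of tree leaves, k' >= k

# Build a threshold tree with k' leaves, each labeled with one of the
# k clusters, then assign every point to a cluster.
tree = Tree(k=k, max_leaves=k_prime)
labels = tree.fit_predict(X)
```

Setting `max_leaves = k` recovers the fully explainable regime, while larger values trade explanation length for lower clustering cost.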
In this paper we study k-means clustering in the online setting. In the offline setting the main parameters are the number of centers, k, and the size of the dataset, n; performance guarantees are given as a function of these parameters. In the online setting, new factors come into play: the order in which the dataset is presented and whether n is known in advance. One of the main results of this paper is the discovery that these new factors have dramatic effects on the quality of the clustering algorithms. For example, for constant k: (1) Ω(n) centers are needed if the order is arbitrary, (2) if the order is random and n is unknown in advance, the number of centers reduces to Θ(log n), and (3) if n is known, the number of centers reduces to a constant. For the different values of these new factors, we show upper and lower bounds that match up to constant factors, thus achieving optimal bounds.
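To make the online setting concrete, the sketch below chooses centers irrevocably as points arrive. The probabilistic rule is a classical Meyerson-style online facility-location heuristic, included only to illustrate the setting; it is not the algorithm analyzed in this paper, and the cost parameter `f` is a hypothetical knob.

```python
import numpy as np

def online_centers(stream, f, seed=0):
    """Illustrative online center selection: open a new center with
    probability min(1, d^2 / f), where d is the distance from the
    arriving point to its nearest current center."""
    rng = np.random.default_rng(seed)
    centers = []
    for x in stream:
        if not centers:
            centers.append(x)  # the first point always opens a center
            continue
        d2 = min(float(np.sum((x - c) ** 2)) for c in centers)
        if rng.random() < min(1.0, d2 / f):
            centers.append(x)
    return centers
```

Note how the number of centers such a rule opens depends on the arrival order, which is exactly the sensitivity the paper quantifies.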
We study k-median clustering under the sequential no-substitution setting. In this setting, a data stream is observed sequentially, and some of the points are selected by the algorithm as cluster centers. However, a point can be selected as a center only immediately after it is observed, before the next point is revealed, and a selected center cannot be substituted later. We give a new algorithm for this setting that obtains a constant approximation factor on the optimal risk under a random arrival order. This is the first algorithm for this setting that requires no assumptions on the input data while selecting a non-trivial number of centers: the number of selected centers is quasi-linear in k. Our algorithm and analysis are based on a careful risk estimation that avoids outliers, a new concept of a linear bin division, and repeated calculations using an offline clustering algorithm.
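The defining constraint is easiest to see as a streaming loop in which every selection decision is immediate and irrevocable. Below is a minimal skeleton of that interface; `should_select` is a hypothetical placeholder for the paper's risk-estimation rule, not its actual criterion.

```python
def no_substitution_clustering(stream, should_select):
    """Sequential no-substitution setting: each point is either taken as
    a center right after it is observed, or lost forever."""
    centers = []
    for t, x in enumerate(stream):
        # The decision is made before point t+1 arrives and cannot be undone.
        if should_select(x, t, centers):
            centers.append(x)
    return centers
```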
We study the faithfulness of an explanation system to the underlying prediction model. We show that this can be captured by two properties, consistency and sufficiency, and introduce quantitative measures of the extent to which these hold. Interestingly, these measures depend on the test-time data distribution. For a variety of existing explanation systems, such as anchors, we analytically study these quantities. We also provide estimators and sample complexity bounds for empirically determining the faithfulness of black-box explanation systems. Finally, we experimentally validate the new properties and estimators.
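As an illustration of how such faithfulness measures can be estimated empirically, the sketch below computes a sufficiency-style agreement rate over test points covered by an explanation. The predicate `applies(explanation, x2)` and the exact formula are assumptions for illustration; they paraphrase rather than reproduce the paper's definitions.

```python
def sufficiency_estimate(model, explanation, applies, x, X_test):
    """Among test points to which the explanation of x also applies,
    estimate how often the model agrees with its prediction on x."""
    y = model(x)
    covered = [x2 for x2 in X_test if applies(explanation, x2)]
    if not covered:
        return float("nan")  # the explanation covers no test points
    agree = sum(model(x2) == y for x2 in covered)
    return agree / len(covered)
```

Because the average runs over test data, the estimate inherits the dependence on the test-time distribution noted above.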
Designing bounded-memory algorithms is becoming increasingly important. Previous works on bounded-memory algorithms focused on proving impossibility results, while the design of such algorithms was left relatively unexplored. To remedy this situation, in this work we design a general bounded-memory learning algorithm for the case where the underlying distribution is known. The core idea of the algorithm is to save not the exact example received, but only a few important bits that carry sufficient information. The algorithm applies to any hypothesis class that has an "anti-mixing" property. This paper complements previous works on unlearnability with bounded memory and provides a step towards a full characterization of bounded-memory learning.
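A toy illustration of the "few important bits" idea is to store only the sign pattern of an example's projections onto a handful of fixed directions, rather than the example itself. The choice of projections here is a hypothetical stand-in, not the paper's construction.

```python
import numpy as np

def compress_example(x, directions):
    """Keep one bit per direction: the sign of the inner product <x, v>."""
    return tuple(int(float(x @ v) > 0) for v in directions)

# Example: compress a 100-dimensional point down to 4 bits.
rng = np.random.default_rng(0)
directions = [rng.normal(size=100) for _ in range(4)]
bits = compress_example(rng.normal(size=100), directions)
```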