Guangliang Chen scite author profile

Spectral Curvature Clustering (SCC)

This paper presents novel techniques for improving the performance of a multi-way spectral clustering framework (Govindu, 2005; Chen and Lerman, 2007, preprint in the supplementary webpage) for segmenting affine subspaces. Specifically, it suggests an iterative sampling procedure to improve the uniform sampling strategy, an automatic scheme for inferring the tuning parameter from the data, a precise initialization procedure for K-means, and a simple strategy for isolating outliers. The resulting algorithm, Spectral Curvature Clustering (SCC), requires only linear storage and runs in time linear in the size of the data. It is supported by theory that both justifies its successful performance and guides our practical choices. We compare it with other existing methods on a few artificial instances of affine subspaces. Application of the algorithm to several real-world problems is also discussed.

Microbiota from Obese Mice Regulate Hematopoietic Stem Cell Differentiation by Altering the Bone Niche

Luo et al. 2015

The effect of metabolic stress on the bone marrow microenvironment is poorly defined. We show that high-fat diet (HFD) decreased long-term Lin(-)Sca-1(+)c-Kit(+) (LSK) stem cells and shifted lymphoid to myeloid cell differentiation. Bone marrow niche function was impaired after HFD, as shown by poor reconstitution of hematopoietic stem cells. HFD led to robust activation of PPARγ2, which impaired osteoblastogenesis while enhancing bone marrow adipogenesis. At the same time, expression of genes that form the bone marrow niche, such as Jag-1, SDF-1, and IL-7, was highly suppressed after HFD. Moreover, structural changes of microbiota were associated with HFD-induced bone marrow changes. Antibiotic treatment partially rescued HFD-mediated effects on the bone marrow niche, while transplantation of stools from HFD mice could transfer the effect to normal mice. These findings show that metabolic stress affects the bone marrow niche through alterations of gut microbiota and osteoblast-adipocyte homeostasis.

Spectral clustering based on local linear approximations

Arias-Castro, Chen, Lerman

2011

Electron. J. Statist.

134

In the context of clustering, we assume a generative model where each cluster is the result of sampling points in the neighborhood of an embedded smooth surface; the sample may be contaminated with outliers, which are modeled as points sampled in space away from the clusters. We consider a prototype for a higher-order spectral clustering method based on the residual from a local linear approximation. We obtain theoretical guarantees for this algorithm and show that, in terms of both separation and robustness to outliers, it outperforms the standard spectral clustering algorithm (based on pairwise distances) of Ng, Jordan and Weiss (NIPS '01). The optimal choice for some of the tuning parameters depends on the dimension and thickness of the clusters. We provide estimators that come close enough for our theoretical purposes. We also discuss the cases of clusters of mixed dimensions and of clusters that are generated from smoother surfaces. In our experiments, this algorithm is shown to outperform pairwise spectral clustering on both simulated and real data.
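As a rough illustration of the quantity this abstract builds on, the sketch below computes the residual of a local linear approximation in the simplest setting: 2-D points fitted by a line. The function names `line_fit_residual` and `residual_affinity`, the Gaussian form of the affinity, and the scale `eps` are assumptions made for illustration; they are not taken from the paper, whose actual construction may differ.

```python
import math

def line_fit_residual(points):
    """Root-mean-square orthogonal distance from 2-D points to their best-fit line.

    Equals the square root of the smallest eigenvalue of the 2x2 covariance matrix.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form smallest eigenvalue of [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    disc = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lam_min = max(half_trace - disc, 0.0)  # clamp tiny negative round-off
    return math.sqrt(lam_min)

def residual_affinity(points, eps):
    """Hypothetical affinity: close to 1 when the points lie near a single line."""
    return math.exp(-(line_fit_residual(points) / eps) ** 2)

# Toy neighborhoods (made-up data): one lies on a line, one does not.
on_one_line = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(line_fit_residual(on_one_line))
print(line_fit_residual(square))
print(residual_affinity(on_one_line, eps=0.1), residual_affinity(square, eps=0.1))
```

A small residual means the neighborhood is well approximated by a line, so an affinity built this way stays high for points on the same thin cluster and drops for neighborhoods that straddle clusters or contain outliers.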

Multi-scale geometric methods for data sets II: Geometric Multi-Resolution Analysis

Allard, Chen, Maggioni

2012

Applied and Computational Harmonic Analysis

109

138

Data sets are often modeled as samples from a probability distribution in $\mathbb{R}^D$, for D large. It is often assumed that the data has some interesting low-dimensional structure, for example that of a d-dimensional manifold M, with d much smaller than D. When M is simply a linear subspace, one may exploit this assumption to encode the data efficiently by projecting onto a dictionary of d vectors in $\mathbb{R}^D$ (found, for example, by SVD), at a cost of (n + D)d for n data points. When M is nonlinear, there are no "explicit" and algorithmically efficient constructions of dictionaries that achieve a similar efficiency: typically one uses either random dictionaries or dictionaries obtained by black-box global optimization. In this paper we construct data-dependent multi-scale dictionaries that aim at efficiently encoding and manipulating the data. Their construction is fast, and so are the algorithms that map data points to dictionary coefficients and vice versa, in contrast with $L^1$-type sparsity-seeking algorithms, but like adaptive nonlinear approximation in classical multiscale analysis. In addition, data points are guaranteed to have a compressible representation in terms of the dictionary, depending on the assumptions on the geometry of the underlying probability distribution.
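To make the linear-subspace cost concrete, the short sketch below compares storing n points in D dimensions directly with storing a dictionary of d vectors (D·d numbers) plus d coefficients per point (n·d numbers), which together give the (n + D)d figure quoted in the abstract. The function names and the sizes used are hypothetical and chosen only for illustration.

```python
def raw_cost(n, D):
    """Numbers stored when n points in R^D are kept as-is."""
    return n * D

def subspace_cost(n, D, d):
    """Numbers stored with a d-vector dictionary: D*d for the basis plus n*d coefficients."""
    return (n + D) * d

# Hypothetical sizes, not taken from the paper.
n, D, d = 10_000, 1_000, 10

print(raw_cost(n, D))          # 10000000
print(subspace_cost(n, D, d))  # 110000
```

With these made-up sizes the subspace encoding needs roughly 1% of the raw storage, which is the kind of saving the abstract aims to reproduce for nonlinear M.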

Foundations of a Multi-way Spectral Clustering Framework for Hybrid Linear Modeling

Chen

2009

Found Comput Math

The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem and provides careful analysis to justify it. The TSCC algorithm is practically a combination of Govindu's multi-way spectral clustering framework (CVPR 2005) and Ng et al.'s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The goodness of clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001).
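The idea of a multi-way affinity can be sketched in the simplest hybrid linear setting, lines in the plane, by scoring three points at a time according to how close they are to collinear. The sketch below uses triangle area as a simplified stand-in for flatness; it is not the curvature measure analyzed in the paper, and `triangle_area`, `triple_affinity`, the Gaussian form, and the value of `sigma` (playing the role of a tuning parameter) are all assumptions for illustration.

```python
import math

def triangle_area(p, q, r):
    """Area of triangle pqr; zero exactly when the three points are collinear."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) / 2.0

def triple_affinity(p, q, r, sigma):
    """Hypothetical three-way affinity: 1 for collinear triples, decaying with area."""
    return math.exp(-(triangle_area(p, q, r) / sigma) ** 2)

# Toy data (made up): points on the line y = 0 and on the line x = 0.
line_a = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
line_b = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
sigma = 0.5  # assumed tuning parameter

same = triple_affinity(line_a[0], line_a[1], line_a[2], sigma)
mixed = triple_affinity(line_a[0], line_a[1], line_b[0], sigma)
print(same, mixed)
```

Triples drawn from one line receive the maximal affinity, while triples mixing the two lines score lower; how sharply the score drops is controlled by `sigma`, which illustrates why the clustering quality depends on a tuning parameter.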
