The incorporation of matrix relations, which encompass multidimensional similarities between local neighborhoods of data points in the underlying manifold of the data, improves the utilization of kernel-based data analysis methodologies. However, the utilization of multidimensional similarities results in a larger kernel, and hence the computational complexity of the corresponding spectral decomposition increases dramatically. In this paper, we propose an efficient approximation of the spectral decomposition of such a multidimensional-similarity-based kernel. Furthermore, we propose a dictionary construction that approximates the resulting oversized kernel and its associated embedding. The performance of the proposed dictionary construction is demonstrated on an example of a super-kernel that combines the Diffusion Maps methodology with linear-projection operators between tangent spaces of the manifold.
I. INTRODUCTION

Recent methods for the analysis of massive high-dimensional data utilize a manifold structure on which the data points are assumed to lie. This manifold is immersed (or submersed) in an ambient space that is defined by the observable parameters. Kernel methods such as k-PCA and Diffusion Maps (DM) [4] have provided good results in analyzing such massive high-dimensional data. The defined kernel can be thought of as an adjacency matrix of a graph whose vertices are the data points in the dataset. The analysis of the eigenvalues and the corresponding eigenvectors of this matrix reveals many properties of and connections in the graph. These methods are based on the spectral decomposition of a kernel that is designed to incorporate a scalar similarity measure between data points. The resulting embedding of the data points into a Euclidean space preserves the qualities represented by the designed kernel. This approach extends the core of the classical Multi-Dimensional Scaling (MDS) method [6], [9] by considering nonlinear relations instead of only the linear ones in its original Gram matrix.

Recently, DM was extended in several different ways to handle the orientation of local tangent spaces [10]-[13]. The relation between two patches is described by a matrix instead of a scalar value. The resulting kernel captures enriched similarities between local structures in the underlying manifold. These enriched similarities can be used to analyze local areas around data points instead of analyzing their specific locations. For example, this analysis can be beneficial in image processing (analyzing regions instead of individual pixels) and when the data points are perturbed, so that their surrounding area is more important than their specific position. Since the constructions of these similarities are based on local tangent spaces, they provide methods to manipulate tangential vector fields (e.g., perform out-of-sample extensions). These manipulations are beneficial when the analyzed data consists of directional information in addition to positional information on the manifold. For example, the goal in [2] is ...
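To make the constructions discussed above concrete, the following is a minimal sketch of a scalar-kernel Diffusion-Maps-style embedding and of a block ("super") kernel assembled from linear projections between local tangent spaces. It assumes a Gaussian affinity and local-PCA tangent bases; the function names and parameters (diffusion_maps_embedding, block_super_kernel, eps, n_neighbors) are illustrative and do not reproduce the exact construction proposed in this paper or in [10]-[13].

```python
import numpy as np

def diffusion_maps_embedding(X, eps, t=1, k=2):
    """Scalar-kernel Diffusion Maps embedding with a Gaussian affinity (illustrative sketch)."""
    # Pairwise squared Euclidean distances and Gaussian affinity kernel.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / eps)
    # Row-normalize into a Markov (diffusion) matrix.
    P = W / W.sum(axis=1, keepdims=True)
    # Spectral decomposition; P is similar to a symmetric matrix, so its eigenvalues are real.
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # Embed each point by the leading non-trivial eigenvectors,
    # scaled by the t-th power of the corresponding eigenvalues.
    return (vals[1:k + 1] ** t) * vecs[:, 1:k + 1]

def block_super_kernel(X, eps, d=2, n_neighbors=10):
    """Block kernel whose (i, j) block couples the scalar affinity
    with a linear projection between local tangent spaces (illustrative sketch)."""
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-d2 / eps)
    # Local PCA around each point approximates its tangent space.
    bases = []
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:n_neighbors + 1]
        _, _, Vt = np.linalg.svd(X[nbrs] - X[i], full_matrices=False)
        bases.append(Vt[:d].T)                 # ambient_dim x d basis U_i
    # Assemble the (n*d) x (n*d) super-kernel block by block.
    K = np.zeros((n * d, n * d))
    for i in range(n):
        for j in range(n):
            O_ij = bases[i].T @ bases[j]       # d x d projection U_i^T U_j
            K[i * d:(i + 1) * d, j * d:(j + 1) * d] = W[i, j] * O_ij
    return K
```

Even in this simplified form, the super-kernel has n*d rows and columns rather than n, which illustrates the size blow-up that motivates the dictionary-based approximation of its spectral decomposition proposed in this paper.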