In this paper, we propose a kernel principal component analysis model for multivariate time series forecasting, where the training and prediction schemes are derived from the multi-view formulation of Restricted Kernel Machines. The training problem is simply an eigenvalue decomposition of the sum of two kernel matrices corresponding to the views of the input and output data. When a linear kernel is used for the output view, the forecasting equation is shown to take the form of kernel ridge regression. When that kernel is non-linear, a pre-image problem has to be solved to forecast a point in the input space. We evaluate the model on several standard time series datasets, perform ablation studies, benchmark it against closely related models, and discuss the results.
Introduction

Kernel methods have seen great success in many applications with very high-dimensional data but few samples, and are therefore among the most popular non-parametric models. In critical machine learning applications, kernel methods are preferred due to their strong theoretical foundation in learning theory [1,2,3,4]. Kernel methods map the data into a high-dimensional (possibly infinite-dimensional) feature space by using the kernel trick. This trick allows for natural, non-linear extensions of traditional linear methods in terms of a dual representation using a suitable kernel function, and has led to numerous popular methods such as kernel principal component analysis [5], kernel Fisher discriminant analysis [6] and the least-squares support vector machine [1].

However, when it comes to learning large-scale problems, kernel methods fall behind deep learning techniques due to their time and memory complexity. This also holds in the time series analysis and forecasting domain, which has recently been dominated by specialized deep neural network models [7,8,9,10].

Attempts have been made to combine kernel and deep learning methods [11,12], especially for specific cases such as deep Gaussian processes [13] and multi-layer support vector machines [14]. Recently, a new unifying framework, named Restricted Kernel Machines (RKM), was proposed [15] that attempts to bridge kernel methods with deep learning. The Lagrangian function of the Least-Squares Support Vector Machine (LS-SVM) is similar to the energy function of Restricted Boltzmann Machines (RBMs), thereby drawing a link between kernel methods and RBMs; hence the name Restricted Kernel Machines.
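The kernel trick mentioned above can be illustrated with kernel PCA [5]: the non-linear principal components are obtained from an eigendecomposition of the centered Gram matrix, without ever forming the feature map explicitly. A minimal sketch in NumPy (the RBF bandwidth, data, and number of components are illustrative assumptions, not values from this paper):

```python
import numpy as np

def kpca(X, n_components=2, sigma=1.0):
    """Kernel PCA with a Gaussian RBF kernel (standard formulation)."""
    n = X.shape[0]
    # Gram matrix K_ij = k(x_i, x_j) via the kernel trick.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    # Center the kernel matrix in feature space.
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    eigvals, eigvecs = np.linalg.eigh(Kc)
    # eigh returns ascending eigenvalues; take the leading components.
    idx = np.argsort(eigvals)[::-1][:n_components]
    # Normalize so the feature-space eigenvectors have unit norm.
    alphas = eigvecs[:, idx] / np.sqrt(np.maximum(eigvals[idx], 1e-12))
    return Kc @ alphas  # projections of the training points

Z = kpca(np.random.default_rng(1).standard_normal((30, 5)))
print(Z.shape)  # (30, 2)
```

The dual representation means only the n-by-n Gram matrix is needed, which is exactly the source of the scalability limitation discussed next.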
Contribution: In this work, we propose a novel kernel autoregressive time series forecasting model based on the RKM framework, where the training problem is the eigendecomposition of the sum of two kernel matrices. Additionally, we use the same objective function to derive a novel prediction scheme that recursively forecasts several steps ahead into the future.
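The two ingredients of the contribution can be sketched as follows. This is a hypothetical simplification, not the paper's exact derivation: the lag length, number of components, kernel bandwidth, and the kernel-ridge-style combination of training outputs (valid when the output kernel is linear, per the abstract) are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Gram matrix of the Gaussian RBF kernel between rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Toy multivariate series: lagged windows as inputs, the next step as output.
rng = np.random.default_rng(0)
series = rng.standard_normal((60, 3))
lag = 4
X = np.stack([series[i:i + lag].ravel() for i in range(len(series) - lag)])
Y = series[lag:]

# "Training": eigendecomposition of the sum of the two view kernels.
Kx = rbf_kernel(X, X)   # input view
Ky = Y @ Y.T            # linear kernel on the output view
_, H = np.linalg.eigh(Kx + Ky)
H = H[:, -10:]          # keep the 10 leading components (eigh sorts ascending)

def predict_one(window):
    # With a linear output kernel, the forecast is a kernel-ridge-like
    # combination of training outputs (assumed sketch, up to scaling).
    k = rbf_kernel(X, window.ravel()[None, :])
    return (Y.T @ (H @ (H.T @ k))).ravel()

# Recursive multi-step forecasting: feed each forecast back into the window.
window = series[-lag:].copy()
forecasts = []
for _ in range(5):
    y_hat = predict_one(window)
    forecasts.append(y_hat)
    window = np.vstack([window[1:], y_hat])  # slide the lag window forward
forecasts = np.asarray(forecasts)
print(forecasts.shape)  # (5, 3)
```

Note that the recursion reuses the one-step predictor unchanged at every horizon, so forecast errors compound with the horizon, which motivates the ablation studies mentioned in the abstract.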