Statistical properties of sketching algorithms

Ahfock, Daniel; Astle, William; Richardson, Sylvia

doi:10.48550/arxiv.1706.03665

Cited by 11 publications

(18 citation statements)

References 30 publications

“…(D) puts restriction on the smallest eigenvalue of the matrix $\tilde{X}_{n,\xi}^{\top}\tilde{X}_{n,\xi}/m_n$. Notably, Gaussian sketching approximately preserves the isometry condition (Ahfock et al, 2017), so that $\exists\,\eta_0 > 0$ with the property that $e_{\min}(\tilde{X}_{n,\xi}^{\top}\tilde{X}_{n,\xi}/m_n) \ge \eta_0\, e_{\min}(X_{n,\xi}^{\top}X_{n,\xi}/n)$ with probability $f_n$ depending on $m_n$ and $p_n$. This, together with assumption A 1 (3) in Song and Liang (2017) is used to argue that assumption (D) is satisfied with a positive probability.…”

Section: Assumptions Framework and The Main Results (mentioning)

confidence: 99%
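The eigenvalue statement quoted above can be checked numerically. The following is a minimal sketch, not the citing authors' procedure: it assumes a sketching matrix with i.i.d. N(0, 1/n) entries and a synthetic Gaussian design, and the function name `min_eig_ratio` and all sizes are hypothetical choices made for illustration.

```python
import numpy as np

def min_eig_ratio(n=2000, p=5, m=400, seed=0):
    """Return e_min(Xs'Xs/m) / e_min(X'X/n) for one Gaussian sketch.

    Assumption: sketch entries are i.i.d. N(0, 1/n), so that
    E[Xs'Xs / m] = X'X / n.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))                 # synthetic design matrix
    Phi = rng.standard_normal((m, n)) / np.sqrt(n)  # Gaussian sketching matrix
    Xs = Phi @ X                                    # sketched design, m x p
    e_sketch = np.linalg.eigvalsh(Xs.T @ Xs / m)[0] # smallest eigenvalue
    e_full = np.linalg.eigvalsh(X.T @ X / n)[0]
    return e_sketch / e_full

ratio = min_eig_ratio()
print(f"smallest-eigenvalue ratio (sketched / full): {ratio:.3f}")
```

Under these assumptions the ratio should stay bounded away from zero when m is large relative to p, which is the role played by the constant η_0 in the quoted inequality.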

“…where the inequality in the fourth line follows from the fact that there exists $\eta > 0$ such that $\|\tilde{X}_n v\|_2^2 \le \eta \|X_n v\|_2^2$, for all $v$ (Ahfock et al, 2017). This implies that $e_{\max}(\tilde{X}_n^{\top}\tilde{X}_n) = \sup$…”

Section: Discussion (mentioning)

confidence: 95%

“…First, by Lemma 7.1, $e_{\min}((\Phi_n \Phi_n^{\top})^{-1}) \ge n/(\sqrt{n} + \sqrt{m_n} + o(\sqrt{n}))^2$ almost surely. Second, $e_{\min}(\tilde{X}_{n,\xi}^{\top}\tilde{X}_{n,\xi}/m_n) \ge e_{\min}(X_{n,\xi}^{\top}X_{n,\xi}/n)\,\eta$, for some $\eta > 0$, by Ahfock et al (2017). Thus, using Assumption (D), it follows that $e_{\min}(\tilde{X}_{n,\xi}^{\top}\tilde{X}_{n,\xi}/m_n) \ge C_4$, for some constant $C_4 > 0$ and for all $\xi \supset \xi^*$ such that $|\xi| \le s_n + \tilde{s}_n$.…”

Section: Condition (Ii): For (mentioning)

confidence: 98%

“…Chowdhury et al (2018) propose a data-dependent algorithm in light of the ridge leverage scores. Other related works include Ailon and Chazelle (2006); Drineas et al (2011); Raskutti and Mahoney (2016); Ahfock et al (2017); Huang (2018). To the best of our knowledge, we are the first to offer efficient and principled Bayesian computation algorithm with linear regressions involving large n and p using data sketching.…”

Section: Introduction (mentioning)

confidence: 99%

Sketching in Bayesian High Dimensional Regression With Big Data Using Gaussian Scale Mixture Priors

Guhaniyogi, Scheffler

2021

Preprint

Bayesian computation of high dimensional linear regression models with a popular Gaussian scale mixture prior distribution using Markov Chain Monte Carlo (MCMC) or its variants can be extremely slow or completely prohibitive due to the heavy computational cost that grows in the order of $p^3$, with $p$ as the number of features. Although a few recently developed algorithms make the computation efficient in presence of a small to moderately large sample size (with the complexity growing in the order of $n^3$), the computation becomes intractable when sample size $n$ is also large. In this article we adopt the data sketching approach to compress the $n$ original samples by a random linear transformation to $m \ll n$ samples in $p$ dimensions, and compute Bayesian regression with Gaussian scale mixture prior distributions with the randomly compressed response vector and feature matrix. Our proposed approach yields computational complexity growing in the cubic order of $m$. Another important motivation for this compression procedure is that it anonymizes the data by revealing little information about the original data in the course of analysis. Our detailed empirical investigation with the Horseshoe prior from the class of Gaussian scale mixture priors shows closely similar inference and a massive reduction in per iteration computation time of the proposed approach compared to the regression with the full sample. One notable contribution of this article is to derive posterior contraction rate for high dimensional predictor coefficient with a general class of shrinkage priors on them under data compression/sketching. In particular, we characterize the dimension of the compressed response vector $m$ as a function of the sample size, number of predictors and sparsity in the regression to guarantee accurate estimation of predictor coefficients asymptotically, even after data compression.
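The compression step described in this abstract can be illustrated with a small numerical sketch. As a stand-in for the Bayesian model with Gaussian scale mixture priors, the example below fits ordinary least squares on the compressed data, which is a deliberate simplification and not the authors' method. The N(0, 1/n) sketch entries, the synthetic data, and the helper name `sketched_ols` are all assumptions made for illustration.

```python
import numpy as np

def sketched_ols(X, y, m, rng):
    """Least squares on data compressed from n rows to m rows.

    Assumption: Gaussian sketching matrix with i.i.d. N(0, 1/n) entries,
    applied to both the response vector and the feature matrix.
    """
    n = X.shape[0]
    Phi = rng.standard_normal((m, n)) / np.sqrt(n)
    Xs, ys = Phi @ X, Phi @ y          # compressed features and response
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return beta

rng = np.random.default_rng(1)
n, p, m = 5000, 10, 500                # m << n
beta_true = np.arange(1, p + 1, dtype=float)
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_sketch = sketched_ols(X, y, m, rng)

print("max |full - true|   :", np.max(np.abs(beta_full - beta_true)))
print("max |sketch - true| :", np.max(np.abs(beta_sketch - beta_true)))
```

After compression, every downstream computation touches only the m-row arrays `Xs` and `ys`, which is where the reduction in cost from order n to order m comes from; the sketched estimate is typically somewhat noisier than the full-data one.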

“…This procedure can be performed by multiplying a given matrix by a random matrix from the right-hand or left-hand side. This is called random projection and it has been shown that this procedure preserves the Euclidean distance among the points approximately [78], [79]. To be precise, let us consider matrix $X$ and target rank $R$. In the first stage of the random projection approach, we generate a random matrix $\Omega = [\omega_1, \omega_2, .$…”

mentioning

confidence: 99%
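The distance-preservation claim in this statement can be illustrated as follows. This is a minimal sketch under stated assumptions, not the algorithm of the cited review: the random matrix Ω is taken to have i.i.d. Gaussian entries scaled by 1/√R, it multiplies X from the right, and the helper names `project` and `pairwise_distances` as well as the sizes are hypothetical.

```python
import numpy as np

def project(X, R, rng):
    """Right-multiply X (points in rows) by a Gaussian matrix Omega.

    Assumption: Omega has i.i.d. N(0, 1/R) entries, so squared
    Euclidean norms are preserved in expectation.
    """
    p = X.shape[1]
    Omega = rng.standard_normal((p, R)) / np.sqrt(R)
    return X @ Omega

def pairwise_distances(A):
    """Euclidean distances between all pairs of rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(2)
X = rng.standard_normal((20, 1000))    # 20 points in 1000 dimensions
Y = project(X, 400, rng)               # same points in 400 dimensions

iu = np.triu_indices(X.shape[0], k=1)  # each unordered pair once
ratios = pairwise_distances(Y)[iu] / pairwise_distances(X)[iu]
print("distance ratio range:", ratios.min(), ratios.max())
```

The ratios concentrate around 1, and the spread shrinks as the projected dimension R grows.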

Randomized Algorithms for Computation of Tucker decomposition and Higher Order SVD (HOSVD)

Ahmadi-Asl, Abukhovich, Asante-Mensah et al.

2020

Preprint

Big data analysis has become a crucial part of new emerging technologies such as Internet of Things (IoT), cyber-physical analysis, deep learning, anomaly detection, etc. Among many other techniques, dimensionality reduction plays a key role in such analyses and facilitates the procedure of feature selection and feature extraction. Randomized algorithms are efficient tools for handling big data tensors. They accelerate decomposing large-scale data tensors by reducing the computational complexity of deterministic algorithms and also reducing the communication among different levels of the memory hierarchy, which is a main bottleneck in modern computing environments and architectures. In this paper, we review recent advances in randomization for computation of Tucker decomposition and Higher Order SVD (HOSVD). We discuss both random projection and sampling approaches and also single-pass and multi-pass randomized algorithms and how they can be utilized in computation of Tucker decomposition and HOSVD. Simulations on real data including weight tensors of fully connected layers of pretrained VGG-16 and VGG-19 deep neural networks and also CIFAR-10 and CIFAR-100 datasets are provided to compare performance of some of the presented algorithms.

On principal components regression, random projections, and column subsampling

Slawski

2018

Electron. J. Statist.

Principal Components Regression (PCR) is a traditional tool for dimension reduction in linear regression that has been both criticized and defended. One concern about PCR is that obtaining the leading principal components tends to be computationally demanding for large data sets. While random projections do not possess the optimality properties of the leading principal subspace, they are computationally appealing and hence have become increasingly popular in recent years. In this paper, we present an analysis showing that for random projections satisfying a Johnson-Lindenstrauss embedding property, the prediction error in subsequent regression is close to that of PCR, at the expense of requiring a slightly larger number of random projections than principal components. Column sub-sampling constitutes an even cheaper way of randomized dimension reduction outside the class of Johnson-Lindenstrauss transforms. We provide numerical results based on synthetic and real data as well as basic theory revealing differences and commonalities in terms of statistical performance.
