The Fast Cauchy Transform and Faster Robust Linear Regression

Clarkson, Kenneth L.; Drineas, Petros; Magdon‐Ismail, Malik; Mahoney, Michael W.; Meng, Xiangrui; Woodruff, David P.

doi:10.1137/1.9781611973105.34

Cited by 42 publications

(108 citation statements)

References 25 publications

Supporting

Mentioning

105

Contrasting


“…Note, that the embedding dimension for p > 2 is n^(1−2/p) poly(d) which improved upon the previous n/poly(d) and is close to optimal given the lower bound of Ω(n^(1−2/p)) [86]. The desirable (1 ± ε) distortion can be achieved using the embeddings for preconditioning and sampling proportional to the ℓp leverage scores [30,34,96].…”

Section: Lemma 11 (Distributional Johnson-lindenstrauss Lemma) There mentioning

confidence: 81%

“…A first step was done by Woodruff and Sohler [93] who designed the first subspace embedding for ℓ1 via Cauchy random variables. The method is in principle generalizable to using p-stable distributions and was improved in [30,77]. The idea is that the sum of such random variables forms again a random variable from the same type of distribution leading to concentration results for the ℓp norm under study.…”

Section: Lemma 11 (Distributional Johnson-lindenstrauss Lemma) There mentioning

confidence: 99%
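
The stability property described in the statement above can be illustrated with a small sketch. It assumes a dense matrix of i.i.d. standard Cauchy entries and a median estimator for the ℓ1 norm; this is not the Fast Cauchy Transform of the cited paper, and the names `cauchy` and `l1_norm_estimate` are hypothetical. Because a weighted sum of independent Cauchy variables with weights x_i is again Cauchy with scale ‖x‖_1, and the median of the absolute value of a standard Cauchy variable is 1, the median of the absolute sketch entries should land near ‖x‖_1.

```python
import math
import random
import statistics

def cauchy(rng):
    # Standard Cauchy sample via the inverse CDF: tan(pi * (U - 1/2)).
    return math.tan(math.pi * (rng.random() - 0.5))

def l1_norm_estimate(x, num_rows=2001, seed=0):
    # Hypothetical helper: apply num_rows rows of a dense Cauchy matrix to x.
    rng = random.Random(seed)
    sketch = []
    for _ in range(num_rows):
        sketch.append(sum(cauchy(rng) * xi for xi in x))
    # By 1-stability each entry is distributed as ||x||_1 times a standard
    # Cauchy variable, and the median of |Cauchy| is 1.
    return statistics.median(abs(s) for s in sketch)

x = [3.0, -1.0, 0.5, 2.5, -4.0]
exact = sum(abs(v) for v in x)   # exact l1 norm, here 11.0
approx = l1_norm_estimate(x)     # randomized estimate of the same quantity
print(exact, approx)
```

The median is used instead of the mean because Cauchy variables have no finite expectation, so averaging the sketch entries would not concentrate.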

“…[96]. The first attempts to generalize to p > 2 [30] had nearly linear size, namely n/poly(d), which clearly was not satisfying. A remedy came with a manuscript of Andoni [13], who discovered the max stability of inverse exponential random variables as a means to embed ℓp, p > 2 with little distortion into ℓ∞.…”

Section: Lemma 11 (Distributional Johnson-lindenstrauss Lemma) There mentioning

confidence: 99%


Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms

Munteanu

Schwiegelshohn

2017

Künstl Intell


“…However, as in the cycle case, rather than evaluating every f̃_{λ1,...,λt} to find the minimum, it is possible to find the minimum more efficiently. One option is to exploit the convexity of f as in Section 3 using a recursive regression algorithm [13] or to use recent results on robust regression via sub-space embeddings [6,15].…”

Section: Proof Consider a Bijection π Between mentioning

confidence: 99%

Sketching Earth-Mover Distance on Graph Metrics

McGregor

Stubbs

2013

Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques


Abstract. We develop linear sketches for estimating the Earth-Mover distance between two point sets, i.e., the cost of the minimum weight matching between the points according to some metric. While Euclidean distance and Edit distance are natural measures for vectors and strings respectively, Earth-Mover distance is a well-studied measure that is natural in the context of visual or metric data. Our work considers the case where the points are located at the nodes of an implicit graph and define the distance between two points as the length of the shortest path between these points. We first improve and simplify an existing result by Brody et al. [4] for the case where the graph is a cycle. We then generalize our results to arbitrary graph metrics. Our approach is to recast the problem of estimating Earth-Mover distance in terms of an ℓ1 regression problem. The resulting linear sketches also yield space-efficient data stream algorithms in the usual way.


“…In Table 1, RLA with algorithmic leveraging (RLA for short) [Clarkson et al, 2013, Yang et al, 2014] is a popular method for obtaining a low-precision solution and randomized IPCPM is an iterative method for finding a higher-precision solution [Meng and Mahoney, 2013b] for unconstrained ℓ1 regression. Clearly, pwSGD has a uniformly better complexity than that of RLA methods in terms of both d and ε, no matter which underlying preconditioning method is used.…”

Section: Introduction mentioning

confidence: 99%

Weighted SGD for ℓ_p Regression with Randomized Preconditioning

Yang

Chow

Ré

et al. 2015

Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms

Self Cite


In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. SGD methods are easy to implement and applicable to a wide range of convex optimization problems. In contrast, RLA algorithms provide much stronger performance guarantees but are applicable to a narrower class of problems. We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems, e.g., ℓ2 and ℓ1 regression problems. We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system. By rewriting a deterministic ℓp regression problem as a stochastic optimization problem, we connect pwSGD to several existing ℓp solvers including RLA methods with algorithmic leveraging (RLA for short). We prove that pwSGD inherits faster convergence rates that only depend on the lower dimension of the linear system, while maintaining low computation complexity. Such SGD convergence rates are superior to other related SGD algorithms such as the weighted randomized Kaczmarz algorithm. Particularly, when solving ℓ1 regression with size n by d, pwSGD returns an approximate solution with ε relative error in the objective value in 𝒪(log n·nnz(A) + poly(d)/ε^2) time. This complexity is uniformly better than that of RLA methods in terms of both ε and d when the problem is unconstrained. In the presence of constraints, pwSGD only has to solve a sequence of much simpler and smaller optimization problems over the same constraints. In general this is more efficient than solving the constrained subproblem required in RLA. For ℓ2 regression, pwSGD returns an approximate solution with ε relative error in the objective value and the solution vector measured in prediction norm in 𝒪(log n·nnz(A) + poly(d) log(1/ε)/ε) time. We show that for unconstrained ℓ2 regression, this complexity is comparable to that of RLA and is asymptotically better over several state-of-the-art solvers in the regime where the desired accuracy ε, high dimension n and low dimension d satisfy d ≥ 1/ε and n ≥ d^2/ε. We also provide lower bounds on the coreset complexity for more general regression problems, indicating that still new ideas will be needed to extend similar RLA preconditioning ideas to weighted SGD algorithms for more general regression problems. Finally, the effectiveness of such algorithms is illustrated numerically on both synthetic and real datasets, and the results are consistent with our theoretical findings and demonstrate that pwSGD converges to a medium-precision solution, e.g., ε = 10^−3, more quickly.

