Low-rank Tensor Estimation via Riemannian Gauss-Newton: Statistical Optimality and Second-Order Convergence

Luo, Yuetian; Zhang, Anru

doi:10.48550/arxiv.2104.12031

Cited by 5 publications

(6 citation statements)

References 94 publications


“…[ARB20] proposed a tensor regression model where the tensor is simultaneously low-rank and sparse in the Tucker decomposition. A concurrent work [LZ21] proposed a Riemannian Gauss-Newton algorithm, and obtained an impressive quadratic convergence rate for tensor regression (see Table 2). Compared with ScaledGD, this algorithm runs in the tensor space, and the update rule is more sophisticated with higher per-iteration cost by solving a least-squares problem and performing a truncated HOSVD every iteration.…”

Section: Additional Related Work (mentioning)

confidence: 99%
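The truncated HOSVD step named in the statement above can be illustrated with a minimal NumPy sketch. This is a generic textbook version, not the implementation from [LZ21]; the helper names (`unfold`, `mode_product`, `truncated_hosvd`, `tucker_to_tensor`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

def unfold(X, mode):
    """Mode-`mode` matricization: rows index the chosen mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_product(X, M, mode):
    """Multiply tensor X by matrix M along `mode`; M has shape (new_dim, X.shape[mode])."""
    return np.moveaxis(np.tensordot(M, X, axes=(1, mode)), 0, mode)

def truncated_hosvd(X, ranks):
    """Return (core, factors) of a rank-`ranks` truncated HOSVD of X."""
    factors = []
    for k, r in enumerate(ranks):
        # Leading r left singular vectors of the mode-k unfolding.
        U, _, _ = np.linalg.svd(unfold(X, k), full_matrices=False)
        factors.append(U[:, :r])
    core = X
    for k, U in enumerate(factors):
        # Project each mode onto its estimated subspace.
        core = mode_product(core, U.T, k)
    return core, factors

def tucker_to_tensor(core, factors):
    """Rebuild the full tensor from a Tucker core and factor matrices."""
    X = core
    for k, U in enumerate(factors):
        X = mode_product(X, U, k)
    return X
```

Each call costs one SVD per mode of the full unfolding, which is the kind of per-iteration expense the statement contrasts with factor-space updates.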

“…To proceed, we need to control $\|(U_0, V_0, W_0) \cdot S_0 - X_\star\|_F$, where $(U_0, V_0, W_0) \cdot S_0$ is the output of HOSVD, which has been considered in [LZ21, HWZ20, ZLRY20]. Invoking the result in [HWZ20, Appendix D.2…”

Section: D2 Proof Of Spectral Initialization (Lemma 5) (mentioning)

confidence: 99%

“…[LZ21, Theorem 3] states the sample complexity $n^{3/2}\sqrt{r}\,\kappa^2 \|X_\star\|_F^2/\sigma_{\min}^2(X_\star)$, where $\|X_\star\|_F^2/\sigma_{\min}^2(X_\star)$ has an order of $r\kappa^2$.…”

mentioning

confidence: 99%
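One way to see where the order $r\kappa^2$ in the statement above comes from is the following short bound. It rests on definitions that are assumptions here, not taken from the quoted text: $\mathcal{M}_k(X_\star)$ is the mode-$k$ unfolding with rank $r_k$, $r = \max_k r_k$, $\sigma_{\min}(X_\star)$ is the smallest nonzero singular value over all unfoldings, and $\kappa = \max_k \sigma_{\max}(\mathcal{M}_k(X_\star))/\sigma_{\min}(X_\star)$.

$$
\|X_\star\|_F^2 = \|\mathcal{M}_k(X_\star)\|_F^2 = \sum_{i=1}^{r_k} \sigma_i^2\big(\mathcal{M}_k(X_\star)\big) \le r_k\,\sigma_{\max}^2\big(\mathcal{M}_k(X_\star)\big) \le r\,\kappa^2\,\sigma_{\min}^2(X_\star).
$$

Dividing by $\sigma_{\min}^2(X_\star)$ gives $\|X_\star\|_F^2/\sigma_{\min}^2(X_\star) \le r\kappa^2$ under these assumptions.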


Scaling and Scalability: Provable Nonconvex Low-Rank Tensor Estimation from Incomplete Measurements

Tong, Ma, Prater-Bennette et al. 2021. Preprint.


Tensors, which provide a powerful and flexible model for representing multi-attribute data and multiway interactions, play an indispensable role in modern data science across various fields in science and engineering. A fundamental task is to faithfully recover the tensor from highly incomplete measurements in a statistically and computationally efficient manner. Harnessing the low-rank structure of tensors in the Tucker decomposition, this paper develops a scaled gradient descent (ScaledGD) algorithm to directly recover the tensor factors with tailored spectral initializations, and shows that it provably converges at a linear rate independent of the condition number of the ground truth tensor for two canonical problems, tensor completion and tensor regression, as soon as the sample size is above the order of $n^{3/2}$, ignoring other parameter dependencies, where $n$ is the dimension of the tensor. This leads to an extremely scalable approach to low-rank tensor estimation compared with prior art, which suffers from at least one of the following drawbacks: extreme sensitivity to ill-conditioning, high per-iteration costs in terms of memory and computation, or poor sample complexity guarantees. To the best of our knowledge, ScaledGD is the first algorithm that achieves near-optimal statistical and computational complexities simultaneously for low-rank tensor completion with the Tucker decomposition. Our algorithm highlights the power of appropriate preconditioning in accelerating nonconvex statistical estimation, where the iteration-varying preconditioners promote desirable invariance properties of the trajectory with respect to the underlying symmetry in low-rank tensor factorization.
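The preconditioning idea in the abstract above can be illustrated with a deliberately simplified sketch. It is not the paper's Tucker-tensor algorithm: it swaps in the simpler matrix analog (a fully observed low-rank matrix factored as `L @ R.T`), where each gradient is right-multiplied by the inverse Gram matrix of the other factor. The function names, the step size `eta=0.5`, and the iteration count are assumptions made for this example.

```python
import numpy as np

def spectral_init(Y, r):
    """Balanced rank-r factors from the top-r SVD of Y."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    root = np.sqrt(s[:r])
    return U[:, :r] * root, Vt[:r].T * root

def scaled_gd_matrix(Y, L0, R0, eta=0.5, iters=50):
    """Scaled gradient descent on 0.5 * ||L R^T - Y||_F^2 (matrix analog only)."""
    L, R = L0.copy(), R0.copy()
    for _ in range(iters):
        resid = L @ R.T - Y
        # Iteration-varying preconditioners: inverse Gram matrices of the factors.
        L_new = L - eta * resid @ R @ np.linalg.inv(R.T @ R)
        R_new = R - eta * resid.T @ L @ np.linalg.inv(L.T @ L)
        L, R = L_new, R_new
    return L, R
```

The preconditioners make the update invariant to invertible rescalings of the factors, which is the matrix counterpart of the invariance property the abstract attributes to the tensor algorithm.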




“…Tensor data are routinely employed in data and information sciences to model (structured) multi-dimensional objects [3,4,5,6,7,8,9]. In many practical scenarios of interest, however, we do not have full access to a large-dimensional tensor of interest, as only a sampling of its entries are revealed to us; yet we would still wish to reliably infer all missing data.…”

Section: A Noisy Low-rank Tensor Completion (mentioning)

confidence: 99%

Uncertainty Quantification for Nonconvex Tensor Completion: Confidence Intervals, Heteroscedasticity and Optimality

Cai, Poor, Chen. 2023. IEEE Trans. Inform. Theory.


We study the distribution and uncertainty of nonconvex optimization for noisy tensor completion, the problem of estimating a low-rank tensor given incomplete and corrupted observations of its entries. Focusing on a two-stage estimation algorithm proposed by [2], we characterize the distribution of this nonconvex estimator down to fine scales. This distributional theory in turn allows one to construct valid and short confidence intervals for both the unseen tensor entries and the unknown tensor factors. The proposed inferential procedure enjoys several important features: (1) it is fully adaptive to noise heteroscedasticity, and (2) it is data-driven and automatically adapts to unknown noise distributions. Furthermore, our findings unveil the statistical optimality of nonconvex tensor completion: it attains unimprovable $\ell_2$ accuracy, including both the rates and the pre-constants, when estimating both the unknown tensor and the underlying tensor factors.


“…Broadly speaking, tensor RPCA concerns with reconstructing a high-dimensional tensor with certain low-dimensional structures from incomplete and corrupted observations. Pertaining to works that deal with the Tucker decomposition, [XY19] proposed a gradient descent based algorithm for tensor completion, [TMPB+22, TMC22] proposed scaled gradient descent algorithms for tensor regression and tensor completion (which our algorithm also adopts), [LZ21] proposed a Gauss-Newton algorithm for tensor regression that achieves quadratic convergence, [WCW21] proposed a Riemannian gradient method with entrywise convergence guarantees, and [ARB20] studied tensor regression assuming the underlying tensor is simultaneously low-rank and sparse.…”

Section: Related Work (mentioning)

confidence: 99%

Fast and Provable Tensor Robust Principal Component Analysis via Scaled Gradient Descent

Harry, Tong, Ma et al. 2022. Preprint.


An increasing number of data science and machine learning problems rely on computation with tensors, which better capture the multi-way relationships and interactions of data than matrices. When tapping into this critical advantage, a key challenge is to develop computationally efficient and provably correct algorithms for extracting useful information from tensor data that are simultaneously robust to corruptions and ill-conditioning. This paper tackles tensor robust principal component analysis (RPCA), which aims to recover a low-rank tensor from its observations contaminated by sparse corruptions, under the Tucker decomposition. To minimize the computation and memory footprints, we propose to directly recover the low-dimensional tensor factors, starting from a tailored spectral initialization, via scaled gradient descent (ScaledGD), coupled with an iteration-varying thresholding operation to adaptively remove the impact of corruptions. Theoretically, we establish that the proposed algorithm converges linearly to the true low-rank tensor at a constant rate that is independent of its condition number, as long as the level of corruptions is not too large. Empirically, we demonstrate that the proposed algorithm achieves better and more scalable performance than state-of-the-art matrix and tensor RPCA algorithms through synthetic experiments and real-world applications.

