Low-rank tensors are an established framework for high-dimensional least-squares problems. We propose to extend this framework by including the concept of block-sparsity. In the context of polynomial regression, each sparsity pattern corresponds to some subspace of homogeneous multivariate polynomials. This allows us to adapt the ansatz space to align better with known sample complexity results. The resulting method is tested in numerical experiments and demonstrates improved computational resource utilization and sample efficiency.

Keywords: empirical L² approximation • sample efficiency • homogeneous polynomials • sparse tensor networks • alternating least squares
Introduction

An important problem in many applications is the identification of a function from measurements or random samples. For this problem to be well-posed, some prior information about the function has to be assumed, and a common requirement is that the function can be approximated in a finite-dimensional ansatz space. For the purpose of extracting governing equations, the most prominent approach in recent years has been SINDy [BPK16]. However, the applicability of SINDy to high-dimensional problems is limited, since truly high-dimensional problems require a nonlinear parametrization of the ansatz space. One particular reparametrization that has proven itself in many applications is given by tensor networks. These allow for a straightforward extension of SINDy [GKES19] but can also encode additional structure, as presented in [GRK+20].

The compressive capabilities of tensor networks originate from this ability to exploit additional structure like smoothness, locality or self-similarity, and they have hence been used to solve high-dimensional equations [KK12, KS18, BK20, EPS16]. In the context of optimal control, tensor train networks have been utilized for solving the Hamilton-Jacobi-Bellman equation in [DKK21, OSS20], for solving backward stochastic differential equations in [RSN21] and for the calculation of stock option prices in [BEST21, GKS20]. In the context of uncertainty quantification they are used in [ENSW19, ESTW19, ZYO+15], and in the context of image classification they are used in [KG19, SS16].

A common thread in these publications is the parametrization of a high-dimensional ansatz space by a tensor train network, which is then optimized. In most cases this means that the least-squares error of the parametrized function with respect to the data is minimized. There exist many methods to perform this minimization. A well-known algorithm in the mathematics community is the alternating linear scheme (ALS) [Ose11a, HRS12a], which is related to the famous DMRG method [Whi92] for solving the Schrödinger equation in quantum physics. Although not directly suitable for recovery tasks, DMRG and ALS can be adapted to work in this context. Two such extensions of the ALS algorithm are the stabilized ALS approximation (SALSA) [GK19] and the block alternating steepest descent for recovery (bASD) algorithm [ENSW19]. Both adapt the tensor network ...