We consider the minimization of stochastic functionals that are compositions of a (potentially) non-smooth convex function h and a smooth function c and, more generally, of stochastic weakly convex functionals. We develop a family of stochastic methods, including a stochastic prox-linear algorithm and a stochastic (generalized) subgradient procedure, and prove that, under mild technical conditions, each converges to first-order stationary points of the stochastic objective. We provide experiments further investigating our methods on non-smooth phase retrieval problems; the experiments indicate the practical effectiveness of the procedures.
We develop procedures, based on minimization of the composition f(x) = h(c(x)) of a convex function h and a smooth function c, for solving random collections of quadratic equalities, applying our methodology to phase retrieval problems. We show that the prox-linear algorithm we develop can solve phase retrieval problems, even with adversarially faulty measurements, with high probability as soon as the number of measurements m is a constant factor larger than the dimension n of the signal to be recovered. The algorithm requires essentially no tuning: it consists of solving a sequence of convex problems, and it is implementable without any particular assumptions on the measurements taken. We provide substantial experiments investigating our methods, indicating the practical effectiveness of the procedures and showing that they succeed with high probability as soon as m/n ≥ 2 when the signal is real-valued.

[Algorithm 1: Prox-linear algorithm for problem (3)]

In Section 6 we provide experimental evidence of the success of our proposed approach. In reasonably high-dimensional settings (n ≥ 1000), with real-valued random Gaussian measurements our method achieves perfect signal recovery in about 80-90% of cases even when m/n = 2. The method also handles outlying measurements well, substantially improving state-of-the-art performance, and we give applications with measurement matrices that demonstrably fail all of our conditions but for which the method is still straightforward to implement and empirically successful.

Notation. We collect our common notation here. We let ‖·‖ and ‖·‖₂ denote the usual vector ℓ₂-norm, and for a matrix A ∈ C^{m×n}, ‖A‖_op denotes its ℓ₂-operator norm. The notation A^H means the Hermitian conjugate (conjugate transpose) of A ∈ C^{m×n}. For a ∈ C, Re(a) denotes its real part and Im(a) its imaginary part. We take ⟨·,·⟩ to be the standard inner product on whatever space it applies to. For a vector c ∈ R^m with sorted values c₍₁₎ ≤ ⋯ ≤ c₍ₘ₎, the αth quantile linearly interpolates c₍⌊mα⌋₎ and c₍⌈mα⌉₎.
For a random variable X, quant α (X) denotes its αth quantile.
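One prox-linear step, as used for the robust phase retrieval objective f(x) = (1/m) Σᵢ |⟨aᵢ,x⟩² − bᵢ|, can be sketched as follows: linearize each cᵢ(x) = ⟨aᵢ,x⟩² − bᵢ inside the absolute value and minimize the resulting convex model plus a proximal quadratic. The inner subgradient solver below is a crude stand-in for the solvers one would use in practice, and the step-size heuristic and parameter values are illustrative assumptions.

```python
import numpy as np

def prox_linear_step(A, b, x, t=1.0, inner_iters=300):
    """One prox-linear step for f(x) = (1/m) sum_i |<a_i,x>^2 - b_i|.

    Builds the convex model obtained by linearizing c_i(x) = <a_i,x>^2 - b_i
    inside the absolute value,
        model(d) = (1/m) sum_i |c_i + 2 r_i <a_i, d>| + ||d||^2 / (2 t),
    approximately minimizes it over the correction d with a plain
    subgradient method (a crude stand-in for a real convex solver),
    and returns x + d.
    """
    m, _ = A.shape
    r = A @ x
    c = r**2 - b
    d = np.zeros_like(x, dtype=float)
    best_d, best_val = d.copy(), np.inf
    for k in range(inner_iters):
        lin = c + 2.0 * r * (A @ d)                    # linearized residuals
        val = np.abs(lin).mean() + d @ d / (2.0 * t)   # model value at d
        if val < best_val:
            best_val, best_d = val, d.copy()
        g = A.T @ (np.sign(lin) * 2.0 * r) / m + d / t  # model subgradient
        d -= t / np.sqrt(k + 1) * g                     # heuristic step, scaled by t
    return x + best_d
```

At the true signal all residuals cᵢ vanish, so the model is minimized by d = 0 and the step leaves x unchanged; iterating this step from a good initialization is what yields the recovery guarantees described above.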
In this paper, we recover sparse signals from their noisy linear measurements by solving nonlinear differential inclusions, building on the notion of inverse scale space (ISS) developed in applied mathematics. Our goal is to bring this idea to bear on a challenging problem in statistics: finding the oracle estimator, which is unbiased and sign-consistent, using dynamics. We call our dynamics Bregman ISS and Linearized Bregman ISS. A well-known shortcoming of LASSO, and of any convex regularization approach, lies in the bias of its estimators. However, we show that under proper conditions there exists a bias-free and sign-consistent point on the solution paths of such dynamics, corresponding to a signal that is an unbiased estimate of the true signal and whose entries have the same signs as those of the true signal, i.e. the oracle estimator. Their solution paths are therefore regularization paths better than the LASSO regularization path, since the points on the latter path are biased once sign-consistency is reached. We also show how to efficiently compute their solution paths in both continuous and discretized settings: the full solution paths can be computed exactly, piece by piece, and a discretization leads to the Linearized Bregman iteration, a simple iterative thresholding rule that is easy to parallelize. Theoretical guarantees such as sign-consistency and minimax-optimal ℓ2-error bounds are established in both continuous and discrete settings for specific points on the paths, and early-stopping rules for identifying these points are given. The key treatment relies on the development of differential inequalities for differential inclusions and their discretizations, which extends previous results and leads to exponentially fast recovery of sparse signals before wrong ones are selected.