Many results have been proved for various nuclear norm penalized estimators in the uniform-sampling matrix completion problem. Most of these estimators, however, are not robust: in most cases the quadratic loss function or one of its modifications is used. We consider robust nuclear norm penalized estimators based on two well-known robust loss functions: the absolute value loss and the Huber loss. Under conditions on the sparsity of the problem (i.e. the rank of the parameter matrix) and on the regularity of the risk function, sharp and non-sharp oracle inequalities for these estimators are shown to hold with high probability. As a consequence, the asymptotic behavior of the estimators is derived. Similar error bounds are obtained under the assumption of weak sparsity, i.e. when the matrix is only approximately low-rank. Throughout we consider a high-dimensional setting, which here means that we assume n ≤ pq. Finally, various simulations confirm our theoretical results.

MSC 2010 subject classifications: Primary 62J05, 62F30; secondary 62H12.
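To make the type of estimator concrete, the following minimal sketch (not the authors' code, and with ad-hoc tuning parameters) shows one standard way such a robust, nuclear norm penalized fit can be computed: proximal gradient descent with a Huber loss and singular value thresholding.

```python
# Minimal illustrative sketch (not the authors' code): proximal gradient
# descent for a Huber-loss, nuclear-norm-penalized matrix completion
# estimator under uniform sampling. Tuning parameters are ad-hoc choices.
import numpy as np

def huber_psi(r, delta):
    """Derivative of the Huber loss evaluated at residuals r."""
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def svt(A, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def robust_complete(Y, mask, lam, delta=1.0, iters=300):
    """Approximately minimize (1/n) sum_obs Huber_delta(Y - M) + lam * ||M||_*."""
    n = mask.sum()
    step = float(n)            # 1 / Lipschitz constant of the smooth part
    M = np.zeros_like(Y)
    for _ in range(iters):
        grad = np.zeros_like(Y)
        grad[mask] = -huber_psi(Y[mask] - M[mask], delta) / n
        M = svt(M - step * grad, step * lam)
    return M

# Toy usage: a rank-2 matrix, uniformly sampled entries, sparse gross errors.
rng = np.random.default_rng(0)
p, q = 60, 60
M0 = rng.normal(size=(p, 2)) @ rng.normal(size=(2, q))
mask = rng.random((p, q)) < 0.6                               # observed entries
Y = M0 + np.where(rng.random((p, q)) < 0.05, 5.0, 0.0)        # corruptions
M_hat = robust_complete(Y, mask, lam=0.003)
print("relative error:", np.linalg.norm(M_hat - M0) / np.linalg.norm(M0))
```

The absolute value loss fits the same template, with the proximal step of the (nonsmooth) loss replacing the gradient step.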
Supervised learning methods with missing data have been studied extensively, not least because of techniques related to low-rank matrix completion. In unsupervised learning, too, one often relies on imputation methods. Indeed, missing values induce a bias in various estimators, such as the sample covariance matrix. In the present paper, a convex method for sparse subspace estimation is extended to the case of missing and corrupted measurements. This is done by correcting the bias rather than imputing the missing values. The resulting estimator is then used as an initial value for a nonconvex procedure to improve the overall statistical performance. The methodological and theoretical frameworks are applied to a wide range of statistical problems, including sparse principal component analysis with different types of randomly missing data. Finally, the statistical performance is demonstrated on synthetic data.
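For illustration only, the sketch below shows one standard way to correct the bias that missing values induce in the sample covariance matrix, under the simplifying assumption that entries are missing completely at random with a known observation probability; the rescaling rule and the subsequent eigendecomposition step are generic choices, not necessarily the paper's exact estimator.

```python
# Illustrative sketch under simplifying assumptions (not necessarily the
# paper's exact estimator): entries are missing completely at random with a
# known observation probability, missing values are zero-filled (not imputed),
# and the bias this induces in the sample covariance is removed by rescaling
# before a principal subspace is read off.
import numpy as np

def debiased_covariance(X_zero_filled, obs_prob):
    """Bias-corrected covariance from zero-filled, zero-mean data."""
    n = X_zero_filled.shape[0]
    S = (X_zero_filled.T @ X_zero_filled) / n
    # Zero-filling shrinks off-diagonal entries by obs_prob**2 and
    # diagonal entries by obs_prob; undo both.
    Sigma = S / obs_prob**2
    np.fill_diagonal(Sigma, np.diag(S) / obs_prob)
    return Sigma

rng = np.random.default_rng(1)
n, d, obs_prob = 500, 20, 0.7
Z = rng.normal(size=(n, d)) * np.linspace(2.0, 0.5, d)   # zero-mean data
X = np.where(rng.random((n, d)) < obs_prob, Z, 0.0)      # zero-filled missing
Sigma_hat = debiased_covariance(X, obs_prob)
eigvals, eigvecs = np.linalg.eigh(Sigma_hat)
leading_direction = eigvecs[:, -1]                       # subspace estimate
print(leading_direction)
```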
Many statistical estimation procedures lead to nonconvex optimization problems. Algorithms for solving these are typically only guaranteed to output a stationary point of the optimization problem. Oracle inequalities are an important theoretical instrument for assessing the statistical performance of an estimator. Existing oracle results, however, have focused on the theoretical properties of the uncomputable (global) minimum or maximum. In the present work, a general framework used for convex optimization problems to derive oracle inequalities is extended to stationary points. A main new feature of these oracle inequalities is that they are sharp: they show closeness to the best approximation within the model plus a remainder term. We apply this framework to several estimation problems.
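Schematically, and in generic notation not taken from the paper, a sharp oracle inequality for a stationary point takes the following form.

```latex
% Schematic statement (generic notation): \tilde{\theta} is any stationary
% point of the penalized empirical risk, R denotes the target (theoretical)
% risk, and the remainder depends on the tuning parameter \lambda and on a
% sparsity measure s(\theta) of the candidate \theta.
\[
  R(\tilde{\theta}) \;\le\; \min_{\theta \in \Theta}
  \Bigl\{\, R(\theta) + C \,\lambda^{2}\, s(\theta) \,\Bigr\}.
\]
```

The leading constant one in front of R(θ) is what makes the inequality sharp: the stationary point is compared directly to the best approximation within the model, up to the remainder term.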
In this work we describe a highly automated procedure ('workflow') for the analysis of electronic and molecular structure data obtained from quantum chemical computations. The data generated as part of this workflow are archived in an XML/CML database and then processed by means of statistical analysis. This production and analysis machinery is applied to the inference of dependencies between electron delocalization and the properties of functionalized, linearly π-conjugated compounds. This information is the source for the generation of rules, or knowledge, applicable in the rational design of functional materials.
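As a rough illustration of the analysis step only, the sketch below reads descriptor values from CML files and fits a simple linear model relating a delocalization measure to a molecular property; the directory name, element names, and descriptor names are hypothetical and do not reflect the schema of the actual workflow or database.

```python
# Rough illustration of the statistical-analysis step. The directory name,
# CML element names, and descriptor names are hypothetical assumptions.
import xml.etree.ElementTree as ET
from pathlib import Path
import numpy as np

def load_descriptors(archive_dir):
    """Collect (delocalization, property) pairs from CML files in a directory."""
    rows = []
    for path in sorted(Path(archive_dir).glob("*.cml")):
        root = ET.parse(path).getroot()
        deloc = root.findtext(".//scalar[@title='delocalization']")
        gap = root.findtext(".//scalar[@title='homo_lumo_gap']")
        if deloc is not None and gap is not None:
            rows.append((float(deloc), float(gap)))
    return np.array(rows)

data = load_descriptors("cml_archive")                   # hypothetical path
X = np.column_stack([np.ones(len(data)), data[:, 0]])    # intercept + descriptor
coef, *_ = np.linalg.lstsq(X, data[:, 1], rcond=None)
print("fitted dependence of the property on delocalization:", coef[1])
```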