Kjell Konis scite author profile

Covariance and correlation estimates have important applications in data mining. In the presence of outliers, classical estimates of covariance and correlation matrices are not reliable. A small fraction of outliers, in some cases even a single outlier, can distort the classical covariance and correlation estimates, making them virtually useless. That is, correlations for the vast majority of the data can be reported very erroneously; principal components transformations can be misleading; and multidimensional outlier detection via Mahalanobis distances can fail to detect outliers. There is plenty of statistical literature on robust covariance and correlation matrix estimates, with an emphasis on affine-equivariant estimators that possess high breakdown points and small worst-case biases. All such estimators have unacceptable exponential complexity in the number of variables and quadratic complexity in the number of observations. In this paper we focus on several variants of robust covariance and correlation matrix estimates with quadratic complexity in the number of variables and linear complexity in the number of observations. These estimators are based on several forms of pairwise robust covariance and correlation estimates. The estimators studied include two fast estimators based on coordinate-wise robust transformations embedded in an overall procedure recently proposed by [14]. We show that the estimators have attractive robustness properties, and give an example that uses one of the estimators in the new Insightful Miner data mining product.

Inference about the number of contributors to a DNA mixture: Comparative analyses of a Bayesian network approach and the maximum allele count method

Biedermann

Bozza

Konis

et al. 2012

Forensic Science International: Genetics


Sparse approximations of protein structure from noisy random projections

Panaretos

Konis

2011

Ann. Appl. Stat.


Single-particle electron microscopy is a modern technique that biophysicists employ to learn the structure of proteins. It yields data that consist of noisy random projections of the protein structure in random directions, with the added complication that the projection angles cannot be observed. In order to reconstruct a three-dimensional model, the projection directions need to be estimated by use of an ad hoc starting estimate of the unknown particle. In this paper we propose a methodology that does not rely on knowledge of the projection angles to construct an objective, data-dependent, low-resolution approximation of the unknown structure that can serve as such a starting estimate. The approach assumes that the protein admits a suitable sparse representation, and employs discrete $L^1$-regularization (LASSO) as well as notions from shape theory to tackle the peculiar challenges involved in the associated inverse problem. We illustrate the approach by application to the reconstruction of an E. coli protein component called the Klenow fragment. Comment: Published at http://dx.doi.org/10.1214/11-AOAS479 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).


Nonparametric Construction of Multivariate Kernels

Panaretos

Konis

2012

Journal of the American Statistical Association



Kjell Konis

Scalable robust covariance and correlation estimates for data mining
