Significance

Modern large-scale data analysis and machine learning applications rely critically on computationally efficient algorithms. There are two main classes of algorithms used in this setting: those based on optimization and those based on Monte Carlo sampling. The folk wisdom is that sampling is necessarily slower than optimization and is warranted only in situations where estimates of uncertainty are needed. We show that this folk wisdom is not correct in general: there is a natural class of nonconvex problems for which the computational complexity of sampling algorithms scales linearly with the model dimension, while that of optimization algorithms scales exponentially.
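As a concrete illustration of the two algorithm classes discussed above (not taken from the paper), the sketch below contrasts plain gradient descent with the unadjusted Langevin algorithm on a toy nonconvex potential; the potential, step sizes, and function names are assumptions chosen only for the example.

```python
import numpy as np

def grad_U(x):
    # Gradient of the nonconvex potential U(x) = sum((x_i**2 - 1)**2),
    # a double well in every coordinate, chosen purely for illustration.
    return 4.0 * x * (x ** 2 - 1.0)

def gradient_descent(x0, step=1e-2, iters=5000):
    # Optimization: deterministic updates converge to one nearby local minimum.
    x = x0.copy()
    for _ in range(iters):
        x = x - step * grad_U(x)
    return x

def unadjusted_langevin(x0, step=1e-2, iters=5000, seed=0):
    # Sampling: the same gradient step plus Gaussian noise yields approximate
    # samples from the density proportional to exp(-U(x)).
    rng = np.random.default_rng(seed)
    x = x0.copy()
    samples = np.empty((iters, x0.size))
    for k in range(iters):
        x = x - step * grad_U(x) + np.sqrt(2.0 * step) * rng.standard_normal(x0.size)
        samples[k] = x
    return samples

if __name__ == "__main__":
    x0 = np.full(10, 0.5)  # 10-dimensional toy problem
    print("gradient descent endpoint:", gradient_descent(x0)[:3])
    print("Langevin sample mean:     ", unadjusted_langevin(x0).mean(axis=0)[:3])
```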
We revisit a pioneering unsupervised learning technique called archetypal analysis [5], which is related to successful data analysis methods such as sparse coding [18] and non-negative matrix factorization [19]. Since it was proposed, archetypal analysis has not gained much popularity, even though it produces more interpretable models than other alternatives. Because no efficient implementation has ever been made publicly available, its application to important scientific problems may have been severely limited. Our goal is to bring archetypal analysis back into favour. We propose a fast optimization scheme using an active-set strategy, and provide an efficient open-source implementation interfaced with Matlab, R, and Python. Then, we demonstrate the usefulness of archetypal analysis for computer vision tasks such as codebook learning, signal classification, and large image collection visualization.
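For concreteness, here is a minimal NumPy sketch of the archetypal analysis model (data approximated by convex combinations of archetypes, which are themselves convex combinations of data points), solved with simple alternating projected-gradient updates. This is not the paper's active-set algorithm or its released implementation; all function names, step sizes, and iteration counts are assumptions made for this illustration.

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of a vector onto the probability simplex
    # {w : w >= 0, sum(w) = 1}, via the standard sorting-based routine.
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u + (1.0 - css) / idx > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def project_columns(M):
    # Project every column of M onto the simplex.
    return np.apply_along_axis(project_simplex, 0, M)

def archetypal_analysis(X, p, iters=200, step=1e-3, seed=0):
    # X: (d, n) data matrix, p: number of archetypes.
    # Model: X ~ (X B) A, where the columns of B (n, p) and A (p, n) lie on
    # the simplex, so archetypes Z = X B are convex combinations of data
    # points and each data point is a convex combination of archetypes.
    rng = np.random.default_rng(seed)
    d, n = X.shape
    A = project_columns(rng.random((p, n)))
    B = project_columns(rng.random((n, p)))
    for _ in range(iters):
        Z = X @ B
        # Projected gradient step on A for 0.5 * ||X - Z A||_F^2.
        A = project_columns(A - step * (Z.T @ (Z @ A - X)))
        # Projected gradient step on B for 0.5 * ||X - X B A||_F^2.
        R = X @ B @ A - X
        B = project_columns(B - step * (X.T @ R @ A.T))
    return X @ B, A  # archetypes Z (d, p) and codes A (p, n)

if __name__ == "__main__":
    X = np.random.default_rng(1).random((5, 100))  # toy data
    Z, A = archetypal_analysis(X, p=4)
    print("reconstruction error:", np.linalg.norm(X - Z @ A))
```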