This work considers the problem of computing the canonical polyadic decomposition (CPD) of large tensors. Prior works mostly leverage data sparsity to handle this problem, which is not suitable for the dense tensors that often arise in applications such as medical imaging, computer vision, and remote sensing. Stochastic optimization is known for its low memory cost and low per-iteration complexity when handling dense data. However, existing stochastic CPD algorithms are not flexible enough to incorporate the variety of constraints and regularizations that are of interest in signal and data analytics. The convergence properties of many such algorithms are also unclear. In this work, we propose a stochastic optimization framework for large-scale CPD with constraints and regularizations. The framework works in a doubly randomized fashion, and can be regarded as a judicious combination of randomized block coordinate descent (BCD) and stochastic proximal gradient (SPG). The algorithm enjoys lightweight updates and a small memory footprint. In addition, the framework offers considerable flexibility: many frequently used regularizers and constraints can be readily handled under the proposed scheme. The approach is also supported by convergence analysis. Numerical experiments on large-scale dense tensors showcase the effectiveness of the proposed approach.
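To make the "doubly randomized" idea concrete, the following is a minimal sketch of one plausible reading of the abstract, not the authors' implementation. It assumes a third-order dense tensor, a least-squares fitting loss, uniform sampling of both the mode (the BCD part) and a batch of mode-n fibers (the SPG part), and a nonnegativity constraint handled by its proximal operator. All function names, the step size, and the batch size are hypothetical choices made for illustration.

```python
import numpy as np


def prox_nonneg(M):
    """Proximal operator of the nonnegativity indicator (projection onto M >= 0)."""
    return np.maximum(M, 0.0)


def rel_error(X, factors):
    """Relative Frobenius error of the rank-R CPD model against X."""
    model = np.einsum('ir,jr,kr->ijk', *factors)
    return np.linalg.norm(X - model) / np.linalg.norm(X)


def doubly_randomized_step(X, factors, step, n_fibers, rng, prox=prox_nonneg):
    """One illustrative iteration: random block, random fibers, proximal update.

    `factors` is a list of three (I_n x R) arrays and is updated in place.
    """
    n = int(rng.integers(3))                  # randomized BCD: pick one mode
    a, b = [m for m in range(3) if m != n]    # the two modes held fixed
    Xn = np.moveaxis(X, n, 0)                 # shape (I_n, I_a, I_b)

    # Stochastic part: sample a small batch of mode-n fibers.
    ia = rng.integers(X.shape[a], size=n_fibers)
    ib = rng.integers(X.shape[b], size=n_fibers)
    Xs = Xn[:, ia, ib]                        # sampled fibers, (I_n, n_fibers)
    H = factors[a][ia] * factors[b][ib]       # matching Khatri-Rao rows, (n_fibers, R)

    # Gradient of (1 / (2 * n_fibers)) * ||Xs - A_n H^T||_F^2 with respect to A_n.
    grad = (factors[n] @ (H.T @ H) - Xs @ H) / n_fibers

    # Proximal gradient step; swapping `prox` changes the constraint/regularizer.
    factors[n] = prox(factors[n] - step * grad)
    return n
```

Each iteration touches only `n_fibers` fibers of the tensor and one factor matrix, which is where the lightweight updates and small memory footprint would come from under these assumptions. Passing a different `prox` (for example, soft-thresholding for an l1 penalty) is how other constraints or regularizers could be plugged into the same loop.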
Graph clustering is a core technique for network analysis problems, e.g., community detection. This work puts forth a node clustering approach for largely incomplete adjacency graphs. Under the considered scenario, instead of having access to the complete graph, only a small number of queries about the graph edges can be made for node clustering. This task is well motivated in many large-scale network analysis problems, where acquiring the complete graph is prohibitively costly. Prior work tackles this problem under the setting that each node admits a single membership and the clusters are disjoint, yet multiple-membership nodes and overlapping clusters often arise in practice. Existing approaches also rely on random edge query patterns and convex optimization-based formulations, which give rise to a number of implementation and scalability challenges. This work offers a framework that provably learns the mixed membership of nodes from overlapping clusters using limited edge information. Our method is equipped with a systematic edge query pattern, which is arguably easier to implement than its random counterparts in certain applications, e.g., field survey based graph analysis. A lightweight, scalable algorithm is proposed, and its performance characterizations are presented. Numerical experiments showcase the effectiveness of our method.