The applications of binary matrices are numerous. Representing a matrix as a mixture of a small collection of latent vectors via low-rank factorization is often seen as an advantageous way to interpret and analyze data. In this work, we examine the minimal-rank factorizations of binary matrices using standard arithmetic (real and nonnegative) and logical operations (Boolean and Z2). We examine the relationships between the different ranks and discuss when the factorizations are unique. In particular, we characterize when a Boolean factorization X = W ∧ H has a unique W, a unique H (for a fixed W), and when both W and H are unique, given a rank constraint.
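To make the difference between the arithmetics concrete, the following is a minimal sketch, not taken from the paper, that multiplies the same pair of binary factors under real, Boolean, and Z2 arithmetic. The helper names `boolean_product` and `z2_product` and the example matrices are illustrative assumptions.

```python
import numpy as np


def boolean_product(W, H):
    """Boolean product: entry (i, j) is OR over k of (W[i, k] AND H[k, j])."""
    return ((W.astype(int) @ H.astype(int)) > 0).astype(int)


def z2_product(W, H):
    """Product over Z2: the ordinary integer product reduced modulo 2."""
    return (W.astype(int) @ H.astype(int)) % 2


# Hypothetical binary factors with inner dimension 2.
W = np.array([[1, 0],
              [1, 1],
              [0, 1]])
H = np.array([[1, 1, 0],
              [0, 1, 1]])

X_real = W @ H                   # standard arithmetic: 1 + 1 = 2
X_bool = boolean_product(W, H)   # Boolean arithmetic: 1 OR 1 = 1
X_z2 = z2_product(W, H)          # Z2 arithmetic: 1 + 1 = 0

# X_bool has a Boolean factorization of inner dimension 2, so its Boolean
# rank is at most 2, while its rank over the reals is 3.
real_rank_of_X_bool = np.linalg.matrix_rank(X_bool)
```

The three products differ only in the middle entry, where both latent vectors overlap. That single entry is enough to separate the real rank of `X_bool` from its Boolean rank.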
A tensor provides a concise way to codify the interdependence of complex data. Treating a tensor as a d-way array, each entry records the interaction between the different indices. Clustering provides a way to parse the complexity of the data into more readily understandable information. Clustering methods are heavily dependent on the algorithm of choice, as well as the chosen hyperparameters of the algorithm. However, their sensitivity to data scales is largely unknown. In this work, we apply the discrete wavelet transform to analyze the effects of coarse-graining on clustering tensor data. We are particularly interested in understanding how scale affects clustering of the Earth's climate system. The discrete wavelet transform allows classification of the Earth's climate across a multitude of spatial-temporal scales. The discrete wavelet transform is used to produce an ensemble of classification estimates, as opposed to a single classification. Using information theory, we discover a sub-collection of the ensemble that spans the majority of the variance observed, allowing for efficient consensus clustering techniques that can be used to identify climate biomes.
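As an illustration of coarse-graining with a discrete wavelet transform, the sketch below applies repeated Haar approximation steps to a one-dimensional signal. The choice of the Haar wavelet, the function names `haar_step` and `coarse_grain`, and the sample values are assumptions made for this example; the abstract does not state which wavelet or data layout is used.

```python
import numpy as np


def haar_step(x):
    """One level of the orthonormal Haar DWT on an even-length 1-D signal.

    Returns (approximation, detail) coefficients, each half the input length.
    """
    x = np.asarray(x, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("signal length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass: coarse-grained signal
    detail = (even - odd) / np.sqrt(2.0)   # high-pass: discarded fine scale
    return approx, detail


def coarse_grain(x, levels):
    """Return the signal at each scale, from finest to coarsest."""
    scales = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        approx, _ = haar_step(scales[-1])
        scales.append(approx)
    return scales


# Hypothetical signal (for instance, a short time series at one grid cell).
signal = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
scales = coarse_grain(signal, levels=2)
approx1, detail1 = haar_step(signal)
```

Clustering each element of `scales` separately would give one classification per scale, which is one plausible way to build the kind of ensemble the abstract describes.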
A novel approach to Boolean matrix factorization (BMF) is presented. Instead of solving the BMF problem directly, this approach solves a nonnegative optimization problem with an additional constraint over an auxiliary matrix whose Boolean structure is identical to the initial Boolean data. This auxiliary matrix constraint forces the support of the nonnegative matrix factorization (NMF) solution to adhere to that of a BMF solution. The solution of the nonnegative auxiliary optimization problem is thresholded to provide a solution for the BMF problem. We prove that the two solution spaces are equivalent when an exact solution exists. Moreover, the nonincreasing property of the algorithm is also proven. Experiments on synthetic and real datasets are conducted to show the effectiveness and complexity of the algorithm compared to other current methods.
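The thresholding step can be illustrated with a small sketch. It does not implement the paper's optimization algorithm; it only assumes that some nonnegative factors are already available and shows why their supports yield a Boolean factorization. The names `threshold`, `boolean_product`, `W_nn`, `H_nn`, and the tolerance `tol` are hypothetical.

```python
import numpy as np


def threshold(A, tol=1e-9):
    """Map a nonnegative matrix to its Boolean support (entries above tol)."""
    return (np.asarray(A) > tol).astype(int)


def boolean_product(W, H):
    """Boolean product: entry (i, j) is OR over k of (W[i, k] AND H[k, j])."""
    return ((W @ H) > 0).astype(int)


# Hypothetical nonnegative factors, standing in for the output of a
# nonnegative solver (the solver itself is not shown here).
W_nn = np.array([[0.7, 0.0],
                 [0.4, 1.3],
                 [0.0, 2.0]])
H_nn = np.array([[1.5, 0.2, 0.0],
                 [0.0, 0.9, 0.6]])

# Auxiliary real-valued matrix; its support plays the role of the Boolean data.
Y = W_nn @ H_nn
X = threshold(Y)

# Thresholding the factors gives Boolean factors that reproduce X exactly.
W_b = threshold(W_nn)
H_b = threshold(H_nn)
X_rebuilt = boolean_product(W_b, H_b)
```

The equality holds because a sum of nonnegative terms is positive exactly when at least one term is positive, so the support of a product of nonnegative matrices is the Boolean product of their supports.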
INDEX TERMS Boolean matrix factorization, nonnegative matrix factorization