The Robust Perron Cluster Analysis (PCCA+) has become a popular algorithm for coarse-graining transition matrices of nearly decomposable Markov chains with transition states. Though originally developed for reversible Markov chains, PCCA+ has previously been shown to apply to non-reversible Markov chains as well. However, the algorithm was implemented under the assumption that the dominant (target) eigenvalues are real numbers. To remove this restriction, the generalized Robust Perron Cluster Analysis (G-PCCA+) has recently been developed. G-PCCA+ is based on real Schur vectors instead of eigenvectors and can therefore also coarse-grain transition matrices with complex eigenvalues. In its current implementation, however, G-PCCA+ is computationally expensive, which limits its applicability to large matrix problems. In this paper, we demonstrate that PCCA+ in fact works on any dominant invariant subspace of a nearly decomposable transition matrix, whether it is spanned by Schur vectors or by eigenvectors. In particular, by separating the real and imaginary parts of complex eigenvectors, PCCA+ also works for transition matrices that have complex eigenvalues, including matrices with a circular transition pattern. We show that this separation preserves the invariant subspace, and that our version of the PCCA+ algorithm yields the same coarse-grained transition matrices as G-PCCA+ while running consistently faster than G-PCCA+. The analysis is performed in the Matlab programming language, and the code is provided.
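The claim that separating real and imaginary parts preserves the invariant subspace can be checked numerically on a small example. The sketch below is an illustration only, written in Python rather than the Matlab used in the paper; the 3-state circular transition matrix `P` and all variable names are assumptions made for this example. If P v = λ v with v = u + i w and λ = a + i b, then P u = a u - b w and P w = b u + a w, so span{u, w} is mapped into itself.

```python
import cmath

def matvec(P, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in P]

# Hypothetical row-stochastic matrix with a circular pattern 0 -> 1 -> 2 -> 0.
P = [[0.1, 0.9, 0.0],
     [0.0, 0.1, 0.9],
     [0.9, 0.0, 0.1]]

# P is circulant, so (1, omega, omega^2) is an eigenvector,
# with complex eigenvalue lam = 0.1 + 0.9 * omega.
omega = cmath.exp(2j * cmath.pi / 3)
v = [complex(1), omega, omega ** 2]
lam = 0.1 + 0.9 * omega

# Split the eigenpair into real and imaginary parts.
u = [z.real for z in v]
w = [z.imag for z in v]
a, b = lam.real, lam.imag

# P applied to u and w should stay inside span{u, w}.
Pu = matvec(P, u)
Pw = matvec(P, w)
expected_Pu = [a * ui - b * wi for ui, wi in zip(u, w)]
expected_Pw = [b * ui + a * wi for ui, wi in zip(u, w)]

# Largest deviation from the predicted images; zero up to rounding.
residual = max(abs(x - y)
               for x, y in zip(Pu + Pw, expected_Pu + expected_Pw))
print("eigenvalue:", lam)
print("max residual:", residual)
```

The two real vectors `u` and `w` can then stand in for the complex conjugate eigenvector pair as real basis vectors of the dominant subspace; how the paper's implementation feeds them into PCCA+ is not shown here.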
When estimating a probability density within the empirical Bayes framework, the nonparametric maximum likelihood estimate (NPMLE) tends to overfit the data. This issue is often addressed by regularization: a penalty term is subtracted from the marginal log-likelihood before the maximization step, so that the estimate favors smooth densities. Most penalties currently in use are rather arbitrary brute-force solutions that lack invariance under reparametrization. This contradicts the principle that, if the underlying model has several equivalent formulations, the methods of inductive inference should lead to consistent results. Motivated by this principle and following an information-theoretic approach similar to the construction of reference priors, we suggest a penalty term that guarantees this kind of invariance. The resulting density estimate constitutes an extension of reference priors.
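The generic shape of such a penalized estimate can be written down as follows. This is a sketch of the standard form only: the symbols (observations x_1, ..., x_n, likelihood p(x | θ), candidate density π, penalty functional Φ, and weight γ ≥ 0) are notation assumed here, and the specific penalty proposed by the authors is not reproduced.

```latex
\hat{\pi}_{\gamma}
  = \arg\max_{\pi}
    \left[
      \sum_{i=1}^{n} \log \int p(x_i \mid \theta)\, \pi(\theta)\, \mathrm{d}\theta
      \;-\; \gamma\, \Phi(\pi)
    \right]
```

Setting γ = 0 recovers the unpenalized NPMLE. Invariance under reparametrization means that Φ assigns the same value to a density and to its transform under any smooth bijection of θ, so the estimate does not depend on which equivalent formulation of the model is used.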