We introduce a model of amino acid sequence evolution that accounts for the statistical behavior of real sequences induced by epistatic interactions. We base the model dynamics on parameters derived from multiple sequence alignments analyzed with direct coupling analysis. Known statistical properties such as overdispersion, heterotachy, and gamma-distributed rates across sites are shown to be emergent properties of this model while remaining consistent with neutral evolution theory, thereby unifying observations from previously disjoint evolutionary models of sequences. The relationship between site restriction and heterotachy is characterized by tracking the effective alphabet dynamics of sites. We also observe an evolutionary Stokes shift in the fitness of sequences that have evolved under our simulation. By analyzing the structural information of some proteins, we corroborate that the strongest Stokes shifts derive from sites that physically interact in networks near biochemically important regions. Perspectives on the implementation of our model in the context of the molecular clock are discussed.
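The abstract does not define the effective alphabet of a site. As a minimal sketch, assuming the common convention that it is the exponential of the Shannon entropy of the residue frequencies observed at that site, it could be computed as follows. The function name `effective_alphabet` and the toy columns are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def effective_alphabet(column):
    """Return exp(Shannon entropy) of the residue frequencies in one column.

    Assumption: the effective alphabet is defined as the exponential of the
    entropy (natural log), so it ranges from 1 (fully conserved site) up to
    the number of distinct residues (uniformly used site).
    """
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return math.exp(entropy)


# Hypothetical toy alignment columns, one character per sequence.
conserved_site = "AAAAAAAA"   # a single residue: effective alphabet of 1
variable_site = "ACDEACDE"    # four residues used equally: effective alphabet of 4

print(effective_alphabet(conserved_site))
print(effective_alphabet(variable_site))
```

Under this assumed definition, tracking the value for each site along a simulated trajectory would show how strongly the site is restricted at each point in time.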
Multilayer networks continue to gain significant attention in many areas of study, particularly due to their high utility in modeling interdependent systems such as critical infrastructures, the human brain connectome, and socioenvironmental ecosystems. However, clustering of multilayer networks, especially using information on higher-order interactions of the system entities, remains in its infancy. Yet higher-order connectivity is often the key in such multilayer network applications as developing optimal partitioning of critical infrastructures in order to isolate unhealthy system components under cyber-physical threats, and simultaneous identification of multiple brain regions affected by trauma or mental illness. In this paper, we introduce the concepts of topological data analysis to studies of complex multilayer networks and propose a topological approach for network clustering. The key rationale is to group nodes based not on pairwise connectivity patterns or relationships between observations recorded at two individual nodes, but on how similar in shape their local neighborhoods are at various resolution scales. Since shapes of local node neighborhoods are quantified using a topological summary in terms of persistence diagrams, we refer to the approach as clustering using persistence diagrams (CPD). CPD systematically accounts for the important heterogeneous higher-order properties of node interactions within and between network layers and integrates information from the node neighbors. We illustrate the utility of CPD by applying it to an emerging problem of societal importance: vulnerability zoning of residential properties to weather- and climate-induced risks in the context of house insurance claim dynamics.
We introduce a novel geometry-oriented methodology, based on the emerging tools of topological data analysis, into the change-point detection framework. The key rationale is that change points are likely to be associated with changes in the geometry behind the data-generating process. While the applications of topological data analysis to change-point detection are potentially very broad, in this paper we primarily focus on integrating topological concepts with the existing nonparametric methods for change-point detection. In particular, the proposed geometry-oriented approach aims to enhance the detection accuracy of distributional regime shift locations. Our simulation studies suggest that integrating topological data analysis with some existing algorithms for change-point detection leads to consistently more accurate detection results. We illustrate our new methodology by applying it to two closely related environmental time series data sets, the ice phenology of Lake Baikal and the North Atlantic Oscillation indices, in a search for a possible association between their estimated regime shift locations.