Estimating the generator of a continuous-time Markov jump process from incomplete data is a problem that arises in applications ranging from machine learning to molecular dynamics. Several methods have been devised for this purpose: a quadratic programming approach (cf. . Some of these methods, however, seem to be known only within a particular research community and have later been reinvented in a different context. The purpose of this paper is to compile a catalogue of existing approaches, to compare their strengths and weaknesses, and to test their performance in a series of numerical examples. These examples include carefully chosen model problems and an application to a time series from molecular dynamics.
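To make the estimation problem concrete, the following minimal sketch assumes the process is observed only at a fixed lag tau, so that the data determine a transition matrix P(tau) = exp(tau L), and recovers L from a truncated series for the matrix logarithm. This naive logarithm is not claimed to be one of the catalogued methods; the function names, the 3-state generator, and the lag are hypothetical choices for illustration. The series converges only when P is close to the identity, and in general the logarithm of an estimated transition matrix need not be a valid generator (it can have negative off-diagonal entries), which is one reason more careful methods are needed.

```python
# Sketch only: recover a generator L from P(tau) = exp(tau * L) with a
# truncated matrix-logarithm series. All names and numbers are hypothetical.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(m, terms=30):
    """Matrix exponential by a plain Taylor series (fine for small norms)."""
    n = len(m)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]  # m^k / k!
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def logm_series(p, terms=200):
    """log(P) = sum_{k>=1} (-1)^(k+1) (P - I)^k / k, valid for ||P - I|| < 1."""
    n = len(p)
    d = [[p[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    power = identity(n)
    result = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        power = matmul(power, d)
        sign = 1.0 if k % 2 == 1 else -1.0
        result = [[r + sign * x / k for r, x in zip(rr, pr)]
                  for rr, pr in zip(result, power)]
    return result

def estimate_generator(p, tau):
    """Naive estimate L = log(P) / tau."""
    return [[x / tau for x in row] for row in logm_series(p)]

# Hypothetical 3-state generator: rows sum to zero, off-diagonals >= 0.
true_L = [[-0.3, 0.2, 0.1],
          [0.1, -0.4, 0.3],
          [0.2, 0.2, -0.4]]
tau = 0.5
P = expm([[x * tau for x in row] for row in true_L])
est_L = estimate_generator(P, tau)
```

In this idealized setting P is exact, so the estimate matches the generator up to rounding; with a P estimated from finitely many observations the same computation can fail or return an invalid generator.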
We present a novel method for the identification of the most important metastable states of a system with complicated dynamical behavior from time series information. The novel approach represents the effective dynamics of the full system by a Markov jump process between metastable states and the dynamics within each of these metastable states by rather simple stochastic differential equations (SDEs). Its algorithmic realization exploits the concept of hidden Markov models with output behavior given by SDEs. The numerical effort of the method is linear in the length of the given time series and quadratic in terms of the number of metastable states. The performance of the resulting method is illustrated by numerical tests and by application to molecular dynamics time series of a trialanine molecule.
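The stated cost, linear in the length of the time series and quadratic in the number of metastable states, is the cost profile of standard hidden Markov model recursions. The sketch below illustrates this with a Viterbi decoder. It is a simplification and not the method of the paper: each hidden state emits from a fixed Gaussian instead of following an SDE, and all parameters are assumed known rather than estimated. The function names and the toy data are hypothetical.

```python
import math

# Sketch only: Viterbi decoding for an HMM with Gaussian outputs.
# The Gaussian emissions are a stand-in assumption for SDE output models.

def gaussian_logpdf(x, mean, std):
    return (-0.5 * math.log(2.0 * math.pi * std * std)
            - (x - mean) ** 2 / (2.0 * std * std))

def viterbi(obs, log_init, log_trans, means, stds):
    """Most likely hidden state path; cost is O(len(obs) * m^2)."""
    m = len(means)
    score = [log_init[s] + gaussian_logpdf(obs[0], means[s], stds[s])
             for s in range(m)]
    back = []
    for x in obs[1:]:                      # linear in the series length
        new_score, pointers = [], []
        for j in range(m):                 # quadratic in the state count
            best = max(range(m), key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best)
            new_score.append(score[best] + log_trans[best][j]
                             + gaussian_logpdf(x, means[j], stds[j]))
        score = new_score
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(range(m), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical two-state example: long residence near -1 and near +1.
obs = [-1.1, -0.9, -1.0, 1.2, 0.8, 1.0, -1.0]
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.95), math.log(0.05)],
             [math.log(0.05), math.log(0.95)]]
path = viterbi(obs, log_init, log_trans, means=[-1.0, 1.0], stds=[0.3, 0.3])
```

Working in log-probabilities avoids numerical underflow on long series, which matters for the data volumes that molecular dynamics produces.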
This article is a survey of the present state of the transfer operator approach to the effective dynamics of metastable complex systems, and the variety of algorithms associated with it. We introduce new methods, and we emphasize both the conceptual foundations and the concrete application to the conformation dynamics of a biomolecular system. The algorithmic aspects are illustrated by means of several examples of various degrees of complexity, culminating in their application to a full-scale molecular dynamics simulation of a B-DNA oligomer.
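One common way to make a transfer operator computable is to discretize state space into a finite number of sets and estimate the transition matrix between them from a trajectory. The sketch below assumes the trajectory has already been assigned to discrete states and simply counts transitions at a fixed lag; the function name and the toy trajectory are hypothetical, and the survey's own algorithms may differ. Eigenvalues of the resulting matrix close to 1 indicate metastability. For a two-state row-stochastic matrix the second eigenvalue equals the trace minus one.

```python
# Sketch only: row-stochastic transition matrix from a discrete trajectory.

def count_transition_matrix(traj, n_states, lag=1):
    """Count transitions at the given lag (lag >= 1) and normalize each row."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(traj[:-lag], traj[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left from gets an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical metastable trajectory: long residence, rare switches.
traj = [0] * 8 + [1] * 8 + [0] * 8
T = count_transition_matrix(traj, n_states=2)

# For a 2x2 stochastic matrix the eigenvalues are 1 and trace - 1.
second_eigenvalue = T[0][0] + T[1][1] - 1.0
```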
Introduction. With the increasing availability of ever more powerful computational resources, there is current interest in performing long numerical simulations of large nonlinear dynamical systems, for example, biomolecules, and in examining rather detailed properties of the results. For example, there is a current effort to understand the sequence-dependent physical properties of B-form DNA via the construction and analysis of a self-consistent database of 39 compatible simulations, each of a 15 base pair fragment or oligomer, with the oligomers constructed in such a way that each of the 136 possible independent tetramer sequences is present at least twice [5]. The time series generated in this particular project comprise more than half a terabyte of data. It is accordingly evident that there is an ever increasing need to analyze such time series efficiently, with mathematical algorithms that are practical for data sets of this order of magnitude. In particular, many nonlinear dynamical systems, including biomolecules and specifically DNA, exhibit the phenomenon of metastability; i.e., the trajectory is localized in one subregion of phase space for comparatively long time scales, before undergoing a rapid and rare transition to another region, where it then stays for a comparatively long residency time before eventually undergoing another rapid transition, and so on. Figure 10 illustrates this phenomenon via a plot of a single, scalar-dependent variable, in this