Abstract. The concise representation of complex high-dimensional stochastic systems via a few reduced coordinates is an important problem in computational physics, chemistry and biology. In this paper we use the first few eigenfunctions of the backward Fokker-Planck diffusion operator as a coarse-grained low-dimensional representation for the long-term evolution of a stochastic system, and show that they are optimal under a certain mean squared error criterion. We denote the mapping from physical space to these eigenfunctions as the diffusion map. While in high-dimensional systems these eigenfunctions are difficult to compute numerically by conventional methods such as finite differences or finite elements, we describe a simple data-driven computational method to approximate them from a large set of simulated data. Our method is based on defining an appropriately weighted graph on the set of simulated data, and computing the first few eigenvectors and eigenvalues of the corresponding random walk matrix on this graph. Thus, our algorithm incorporates the local geometry and density at each point into a global picture that naturally merges data from different simulation runs. Furthermore, we describe lifting and restriction operators between the diffusion map space and the original space. These operators facilitate the description of the coarse-grained dynamics, possibly in the form of a low-dimensional effective free energy surface parameterized by the diffusion map reduction coordinates. They also enable a systematic exploration of such effective free energy surfaces through the design of additional "intelligently biased" computational experiments. We conclude by demonstrating our method on a few examples.
Key words. Diffusion maps, dimensional reduction, stochastic dynamical systems, Fokker-Planck operator, metastable states, normalized graph Laplacian.
AMS subject classifications. 60H10, 60J60, 62M05
1. Introduction. Systems of stochastic differential equations (SDEs) are commonly used as models for the time evolution of many chemical, physical and biological systems of interacting particles [22,45,52]. There are two main approaches to the study of such systems. The first is by detailed Brownian Dynamics (BD) or other stochastic simulations, which follow the motion of each particle (or, more generally, each variable) in the system and generate one or more long trajectories. The second is via analysis of the time evolution of the probability densities of these trajectories, using the numerical solution of the corresponding time-dependent Fokker-Planck (FP) partial differential equation.
For typical high-dimensional systems, both approaches suffer from severe limitations when applied directly. The main limitation of standard BD simulations is the scale gap between the atomistic time scale of single-particle motions, at which the SDEs are formulated, and the macroscopic time scales that characterize the long-term evolution and equilibration of these systems. This scale gap puts severe constraints on detailed simulat...