When applying molecular dynamics (MD) simulations, we aim to understand biomolecular processes. Ideally, this understanding builds on statistically robust scientific observations. The key observables of interest are: (1) important structures, (2) their thermodynamic weights, and (3) the transition probabilities among them, or their interconversion rates.

Robust identification of these three properties allows MD results to be connected directly to experimental data, including NMR spectroscopy and sm-FRET [33][34][35][36]. Such comparisons serve as an important complementary means of validating the simulation models and can help drive robust scientific hypotheses and models.

Analysis of MD simulations, however, often relies on visually inspecting simulation trajectories one by one. Alternatively, we follow the simulation trajectories projected onto a few order parameters (or collective variables) derived from chemical intuition about the process of interest or from some global structural property [37][38][39][40][41]. Inspecting structures and following certain order parameters is an integral part of any analysis of MD simulations. However, these strategies alone do not guarantee the statistical relevance of the observed events, and the overall approach becomes increasingly time-consuming as datasets grow. Furthermore, limiting ourselves to these analyses may overlook rare events important for biological function. Ultimately, conclusions drawn from these kinds of analyses may be misleading [30].

Statistical models for analyzing data from MD simulations have received increasing attention in recent years [42][43][44][45][46][47][48][49][50]. This popularity is a natural consequence of growing datasets, enabled by improvements in software efficiency and by large-scale investment from many academic groups in compute resources based on consumer-grade GPUs (graphics processing units).
Another important factor is community-driven, cloud-based supercomputers such as Folding@Home [51] and GPUgrid (www.gpugrid.net), which generate enormous volumes of simulation data whose analysis critically relies on a systematic and principled framework. Markov state models (MSMs) are one prominent example of statistical models for analyzing molecular dynamics simulations that fits the bill [30,42,44,52].

This section briefly discusses the motivation and theoretical basis of MSMs, along with some of their important mathematical properties, motivating subsequent sections. I do not attempt to discuss these topics comprehensively; instead, I provide a guiding primer for the following sections and aim to help the reader build some intuition about the theory. In general, the text is based on the references cited in this section. I intentionally minimize technical language and equations and avoid specific details in the notation for clarity. For a more detailed treatment of MSM theory, I refer to the excellent review by Prinz et al. [30]. For a more comprehensive historical overview of MSMs, I refer to Brooke and Pande'...