CONSPECTUS
RNA polymerase II (Pol II) is an essential enzyme that catalyzes transcription with high efficiency and fidelity in eukaryotic cells. During transcription elongation, Pol II catalyzes the nucleotide addition cycle (NAC) to synthesize messenger RNA using DNA as the template. The transitions between the states of the NAC require conformational changes of both the protein and nucleotides. Although X-ray structures are available for most of these states, the dynamics of the transitions between them are largely unknown. Molecular dynamics (MD) simulations can predict structure-based molecular details and shed light on the mechanisms of these dynamic transitions. However, applying MD simulations to a macromolecule such as Pol II is challenging: individual simulations typically span tens to hundreds of nanoseconds, far short of the biologically relevant timescales (tens of microseconds or even longer). To overcome this challenge, kinetic network models (KNMs) such as Markov State Models (MSMs) have become a popular approach to assess long-timescale conformational changes using many short MD simulations.
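As a rough illustration of how many short trajectories can be stitched into a long-timescale kinetic model, the sketch below estimates an MSM transition matrix from discretized trajectories and converts its eigenvalues into implied timescales. It is a minimal toy example, not the procedure used in the studies discussed here: the function names (`simulate_chain`, `estimate_msm`, `implied_timescales`), the two-state test system, and the simple count-symmetrization step are all assumptions made for illustration.

```python
import numpy as np

def simulate_chain(T, n_steps, rng):
    """Generate one short discrete trajectory from a known transition matrix
    (a stand-in for a clustered MD trajectory; hypothetical helper)."""
    state = rng.integers(T.shape[0])
    traj = [state]
    for _ in range(n_steps - 1):
        state = rng.choice(T.shape[0], p=T[state])
        traj.append(state)
    return np.array(traj)

def estimate_msm(dtrajs, n_states, lag):
    """Count transitions at the given lag time over all trajectories,
    symmetrize the counts (a simple way to impose detailed balance,
    assumed here for brevity), and row-normalize."""
    counts = np.zeros((n_states, n_states))
    for traj in dtrajs:
        traj = np.asarray(traj)
        np.add.at(counts, (traj[:-lag], traj[lag:]), 1)
    counts = 0.5 * (counts + counts.T)
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("at least one state was never visited")
    return counts / row_sums

def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln(lambda_i) for the eigenvalues
    below the stationary one; only eigenvalues in (0, 1) are kept."""
    eig = np.sort(np.linalg.eigvals(T).real)[::-1][1:]
    eig = eig[(eig > 0) & (eig < 1)]
    return -lag / np.log(eig)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T_true = np.array([[0.9, 0.1], [0.2, 0.8]])
    # Many short trajectories instead of one long one.
    dtrajs = [simulate_chain(T_true, 200, rng) for _ in range(50)]
    T_est = estimate_msm(dtrajs, n_states=2, lag=1)
    print(T_est)
    print(implied_timescales(T_est, lag=1))
```

In practice the lag time is chosen by checking that the implied timescales plateau as the lag increases, which is one common test of the Markovian assumption.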
We describe here our application of KNMs to characterize the molecular mechanisms of the NAC of Pol II. First, we introduce the general background of MSMs and explain the procedures for constructing and validating them, including relevant technical details. Next, we outline our previous studies in which we applied MSMs to investigate individual steps of the NAC, including translocation and pyrophosphate ion release, and we summarize the major findings of each application. For each application we present, we also describe in detail how the structural models were built, how conformations were generated to seed the MD simulations, and which parameters were used to construct the MSMs. Finally, to study the overall NAC, we combine the individual steps into a five-state KNM based on a non-branched Brownian ratchet scheme to explain single-molecule optical tweezers experimental data. In describing this KNM application, we state the underlying assumptions of the five-state KNM explicitly and discuss the open questions and future studies that can help refine the KNMs. The studies discussed in this review complement experimental observations and provide molecular mechanisms for the transcription elongation cycle. In the long term, incorporating sequence-dependent kinetic parameters into KNMs has great potential for identifying error-prone sequences and predicting transcription dynamics in genome-wide transcriptomes.
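To make the idea of a non-branched, cyclic five-state kinetic network concrete, the following sketch builds the rate matrix for a single closed cycle, solves for the steady-state populations, and reports the net flux around the cycle (which can be read as cycles completed per unit time, under the assumption that one cycle adds one nucleotide). The state ordering, the function names, and every rate constant below are hypothetical placeholders, not values or state definitions from the work described above.

```python
import numpy as np

def cycle_rate_matrix(kf, kb):
    """Rate matrix K for a closed, non-branched cycle.
    kf[i] is the rate for state i -> i+1, kb[i] the rate for i+1 -> i
    (indices wrap around). K[i, j] is the rate from i to j; rows sum to 0."""
    n = len(kf)
    K = np.zeros((n, n))
    for i in range(n):
        j = (i + 1) % n
        K[i, j] += kf[i]
        K[j, i] += kb[i]
    np.fill_diagonal(K, -K.sum(axis=1))
    return K

def steady_state_flux(kf, kb):
    """Solve p K = 0 with sum(p) = 1, then return the populations and the
    net flux across the first edge (equal across all edges at steady state)."""
    kf = np.asarray(kf, dtype=float)
    kb = np.asarray(kb, dtype=float)
    n = len(kf)
    K = cycle_rate_matrix(kf, kb)
    A = np.vstack([K.T, np.ones(n)])      # stationarity plus normalization
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p = np.linalg.lstsq(A, b, rcond=None)[0]
    flux = p[0] * kf[0] - p[1] * kb[0]
    return p, flux

if __name__ == "__main__":
    # Hypothetical forward and backward rates (1/s) for five states.
    kf = [100.0, 50.0, 20.0, 200.0, 80.0]
    kb = [30.0, 5.0, 1.0, 10.0, 0.5]
    p, flux = steady_state_flux(kf, kb)
    print("steady-state populations:", p)
    print("net cycle flux (cycles per second):", flux)
```

Replacing individual rate constants with sequence-dependent values would be one way to explore how such a model responds to the template, though that extension is not shown here.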