Transcription elongation can be modelled as a three-step process, involving polymerase translocation, NTP binding, and nucleotide incorporation into the nascent mRNA. This cycle of events can be simulated at the single-molecule level as a continuous-time Markov process using parameters derived from single-molecule experiments. Previously developed models differ in the way they are parameterised, and in their incorporation of partial equilibrium approximations.

We have formulated a hierarchical network comprising 12 sequence-dependent transcription elongation models. The simplest model has two parameters and assumes that both translocation and NTP binding can be modelled as equilibrium processes. The most complex model has six parameters and makes no partial equilibrium assumptions. We systematically compared the ability of these models to explain published force-velocity data, using approximate Bayesian computation. This analysis was performed using data for the RNA polymerases of E. coli, S. cerevisiae and bacteriophage T7.

Our analysis indicates that the polymerases differ significantly in their translocation rates, with the rates in T7 pol being fast compared to E. coli RNAP and S. cerevisiae pol II. Different models are applicable in different cases. We also show that all three RNA polymerases have an energetic preference for the posttranslocated state over the pretranslocated state. A Bayesian inference and model selection framework, like the one presented in this publication, should be routinely applicable to the interrogation of single-molecule datasets.
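As an illustration of the three-step cycle described above, the following is a minimal sketch of a continuous-time Markov (Gillespie-style) simulation of a single polymerase. It is not the implementation used in this study: the state names, the function `simulate_elongation`, and every rate constant in `RATES` are hypothetical placeholders, and the sketch omits sequence dependence, applied force, and any partial equilibrium approximation.

```python
import random

# Hypothetical rate constants, chosen only for illustration.
# They are NOT the estimates reported in this study.
RATES = {
    "k_fwd": 100.0,      # pretranslocated -> posttranslocated (per s)
    "k_bck": 50.0,       # posttranslocated -> pretranslocated (per s)
    "k_bind": 2.0,       # NTP binding (per uM per s)
    "k_release": 100.0,  # NTP release (per s)
    "k_cat": 30.0,       # nucleotide incorporation (per s)
}

PRE, POST, BOUND = 0, 1, 2  # states of the polymerase at one template position


def simulate_elongation(n_steps, rates, ntp_uM, rng):
    """Return the simulated time (s) needed to incorporate n_steps nucleotides."""
    t = 0.0
    state = PRE
    incorporated = 0
    while incorporated < n_steps:
        # Each move is (rate, next_state, nucleotides_gained).
        if state == PRE:
            moves = [(rates["k_fwd"], POST, 0)]
        elif state == POST:
            moves = [(rates["k_bck"], PRE, 0),
                     (rates["k_bind"] * ntp_uM, BOUND, 0)]
        else:  # BOUND: either release the NTP or incorporate it
            moves = [(rates["k_release"], POST, 0),
                     (rates["k_cat"], PRE, 1)]
        total = sum(rate for rate, _, _ in moves)
        # Waiting time in the current state is exponential with the total exit rate.
        t += rng.expovariate(total)
        # Choose one move with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for rate, nxt, gain in moves:
            acc += rate
            if u < acc:
                break
        state = nxt
        incorporated += gain
    return t
```

Under these assumptions, a call such as `simulate_elongation(100, RATES, 1000.0, random.Random(0))` returns the total time for 100 incorporation events, and dividing the number of nucleotides by that time gives a mean velocity for one simulated trajectory. Averaging over many trajectories would be needed before comparing with experimental velocities.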
Author summary

Transcription is a critical biological process that occurs in all living organisms. It involves copying the organism's genetic material into messenger RNA (mRNA), which directs protein synthesis on the ribosome. Transcription is performed by RNA polymerases, which have been extensively studied using both ensemble and single-molecule techniques (see reviews: [1,2]). Single-molecule data provide unique insights into the molecular behaviour of RNA polymerase. Transcription at the single-molecule level can be computationally simulated as a continuous-time Markov process and the model outputs compared with experimental data. In this study we use Bayesian techniques to perform a systematic comparison of 12 stochastic models of transcriptional elongation. We demonstrate how equilibrium approximations can strengthen or weaken the model, and show how Bayesian techniques can identify

December 12, 2018 1/23

Introduction

Transcription is carried out by RNA polymerases: RNAP in Escherichia coli, pol II in Saccharomyces cerevisiae, and T7 pol in bacteriophage T7. It involves the copying of template double-stranded DNA (dsDNA) into single-stranded messenger RNA (mRNA). The template is read in the 3′ to 5′ direction, while the mRNA is extended sequentially in the 5′ to 3′ direction. RNAP and pol II are composed of multiple subunits, and their catalytic subunits are homologous [3,4]. In contrast, T7 pol exists as a monomer with a distinct sequence, and ...