Recruitment of the mRNA Capping Enzyme (CE/RNGTT) to the site of transcription is essential for the formation of the 5' mRNA cap, which in turn ensures efficient transcription, splicing, polyadenylation, nuclear export and translation of mRNA in eukaryotic cells. The CE is recruited and activated by the Serine-5 phosphorylated carboxyl-terminal domain (CTD) of RNA polymerase II. Through the use of molecular dynamics simulations and enhanced sampling techniques, we provide a systematic and detailed characterisation of the human CE-CTD interface, describing the effect of the CTD phosphorylation state, length and orientation on this interaction. Our computational analyses identify novel CTD interaction sites on the human CE surface and quantify their relative contributions to CTD binding. We also identify differences in the CTD binding conformation when phosphorylated at either the Serine-2 or Serine-5 positions, thus providing insights into how the CE reads the CTD code. The computational findings are then validated by binding and activity assays. These novel CTD interaction sites are compared with cocrystal structures of the CE-CTD complex in different eukaryotic taxa, leading to the conclusion that this interface is considerably more conserved than previous structures have indicated.
Keywords: mRNA Capping Enzyme, RNA Polymerase II C-terminal Domain, CTD code, MD simulation, protein-peptide interaction

Introduction

mRNA capping is an essential process required for efficient gene expression and regulation in all eukaryotic organisms (1). The mRNA cap prevents degradation by 5'-exonucleases during transcription and acts as a platform to recruit initiation factors required for splicing, polyadenylation, nuclear export and translation (2-8). mRNA is capped at the 5'-end with an inverted 7-methylguanosine moiety. This process occurs in three stages: i) the 5'-end triphosphate is hydrolysed to diphosphate; ii) GMP is covalently linked to the diphosphate 5'-end; iii) the guanosine base is methylated at the N7 position (1). In animals the first two stages are performed by a bifunctional protein, the Capping Enzyme (CE/RNGTT), which contains triphosphatase (TPase) and guanylyltransferase (GTase) enzymatic domains separated by a disordered linker (9, 10). The mammalian CE GTase functions independently of the TPase domain (10-12). The final step, N7 methylation of the guanosine base, is performed by RNMT in complex with its activating mini-protein RAM (13, 14).

The process of mRNA capping is tightly coupled to transcription, occurring during the elongation phase (15, 16). At this stage the CE is recruited to the site of transcription by the RNA polymerase II (Pol II) carboxyl-terminal domain (CTD) (17, 18). The CTD is located in RPB1, the largest subunit of RNA Pol II, and is composed of a tandem repeated heptad motif with the consensus sequence Y₁S₂P₃T₄S₅P₆S₇ (19, 20). This domain is disordered and can be dynamically phosphorylated at several positions to form a highly complex pattern known as the CTD phosphorylat...