Protein splicing is a posttranslational modification in which intervening proteins (inteins) excise themselves from larger precursor proteins and ligate their flanking polypeptides (exteins) through a multistep chemical reaction. First thought to be an anomaly found in only a few organisms, protein splicing by inteins has since been observed in microorganisms from all domains of life. Despite this broad phylogenetic distribution, all inteins share common structural features such as a horseshoe-like pseudo two-fold symmetric fold, several canonical sequence motifs, and similar splicing mechanisms. Intriguingly, the splicing efficiencies and substrate specificities of different inteins vary considerably, reflecting subtle changes in the chemical mechanism of splicing that are linked to their local structure and dynamics. Because intein chemistry is widely used in protein chemistry, understanding the structural and dynamical aspects of inteins is crucial for intein engineering and the improvement of intein-based technologies.
Protein splicing is a posttranslational modification, a multistep autoprocessing event in which intervening proteins (inteins) excise themselves from their precursors, followed by ligation of the external proteins (exteins) (1). Once thought to be anomalies, inteins have since been found in all domains of life. Beyond their biological context, inteins are particularly intriguing because their capacity to process the polypeptide backbone makes them useful tools for protein engineering. Although all inteins share the same fold and have highly conserved sequence motifs in their active sites, they have surprisingly different splicing efficiencies, and their extein sequence preferences differ dramatically. Here, we review studies on intein structure and dynamics that shed light on their complex conserved fold and their divergent efficiency and substrate specificity.
The Intein Fold
The conserved horseshoe-like fold of inteins (Fig. 1A) has been seen in all intein structures solved by NMR and X-ray crystallography (2-11). The fold comprises primarily β-sheets, loops, and two short helices and has pseudo two-fold symmetry (Fig. 1B). Given this symmetry, it has been proposed that this fold arose due to a gene duplication event of some parent protein (6). The intein fold has three remarkable features: 1) the topology is complex, involving multiple passes of the polypeptide chain back and forth between the symmetry-related halves (Fig. 1B); 2) the extein-bearing termini of the intein are brought in close proximity (<10 Å) for splicing; and 3) multiple protease-like active sites composed of conserved sequence motifs are built around these termini to carry out each of the chemical steps involved in protein splicing. Folding of an intein is coupled to the reaction; the initial structure facilitates the first step of splicing, and thereafter each splicing step causes local conformational changes, affecting the fold of the catalytic apparatus and hence affecting the reaction coordinate and shifting the equilibrium positi...