Manduca sexta serpin gene-1 encodes a family of serpins whose amino acid sequences are identical in their amino-terminal 336 residues but variable in their carboxyl-terminal 39-46 residues, which includes the reactive site loop (Jiang, H., Wang, Y., and Kanost, M. R. (1994) J. Biol. Chem. 269, 55-58). Here, we report the gene's complete nucleotide sequence and exon-intron structure. A unique characteristic of this gene is its exon 9, which is present in 12 alternate forms between exons 8 and 10. Isolation and characterization of cDNA clones containing exons 9C, 9H, and 9I, which were not found previously, indicate that all 12 alternate forms of exon 9 can be utilized to generate 12 different serpins. The splicing pathway apparently allows inclusion of only one exon 9 per molecule of mature serpin-1 mRNA. Analysis of exon-intron border sequences reveals unique features that may be involved in regulation of RNA splicing. The exon 9 region has apparently evolved through rounds of exon duplication and sequence divergence. The exons near the center of the region may have evolved recently, whereas the outermost exons are the most ancient. Exons 9G and 9H were duplicated as a pair from exons 9E and 9F, an event that may have occurred more than once in the history of this gene.
The serpin superfamily contains a large number of proteins that function as inhibitors of serine proteinases as well as proteins related in sequence which are not inhibitors (1). Serpins are typically 370-390 amino acid residues long, with a reactive site loop 30-40 residues from the carboxyl terminus. This loop, exposed at the surface of the protein, is the site of interaction between serpins and the serine proteinases they inhibit. The serpin reactive site loop binds to the active site of a target proteinase in a manner similar to the binding of a substrate, forming a very stable serpin-proteinase complex. Formation of this complex involves a specific peptide bond in the reactive site loop, the scissile bond (designated P1-P1′). The amino acid sequence of the reactive site loop determines an inhibitor's selectivity. Altering the reactive site loop sequence, particularly at the P1 position, can cause dramatic changes in the proteinase selectivity of a serpin (1). Comparisons of serpin sequences have demonstrated that the reactive site loop and adjacent sequence is the least conserved region of the proteins. It has been suggested that after duplications of serpin genes, rapid evolutionary change of the reactive site loop region provides new inhibitor selectivities that may have value during natural selection (2, 3).
At least nine different serpins are present in mammalian plasma. They regulate the activity of serine proteinases involved in diverse physiological functions such as blood coagulation, fibrinolysis, complement activation, and inflammatory responses (1). Serpins have also been found in the hemolymph of invertebrates, including three groups of arthropods: insects, crayfish, and horseshoe crabs (4). These arthropod serpins have 1...