Most eukaryotic messenger RNA precursors (pre-mRNAs) undergo extensive maturational processing, including 3'-end cleavage and polyadenylation [1][2][3][4][5][6][7][8]. Despite the characterization of a large number of proteins that are required for the cleavage reaction, the identity of the endoribonuclease is not known [4,9,10]. Recent analyses suggested that the 73 kD subunit of cleavage and polyadenylation specificity factor (CPSF-73) may be the endonuclease for this and related reactions [10][11][12][13][14][15], although no direct data confirmed this. Here we report the crystal structures of human CPSF-73 at 2.1 Å resolution, complexed with zinc ions and a sulfate that may mimic the phosphate group of the substrate, and the related yeast protein CPSF-100 (Ydh1p) at 2.5 Å resolution. Both CPSF-73 and CPSF-100 contain two domains, a metallo-β-lactamase domain and a novel β-CASP domain. The active site of CPSF-73, with two zinc ions, is located at the interface of the two domains. Purified recombinant CPSF-73 possesses endoribonuclease activity, and mutations that disrupt zinc binding in the active site abolish this activity. Our studies provide the first direct experimental evidence that CPSF-73 is the pre-mRNA 3'-end processing endonuclease.
Keywords: polyadenylation; metallo-β-lactamase; pre-mRNA processing; Artemis; V(D)J recombination; double-strand break repair
CPSF-73 belongs to the metallo-β-lactamase superfamily of zinc-dependent hydrolases [11,12]. Canonical metallo-β-lactamases contain five signature sequence motifs: Asp (motif 1), His-X-His-X-Asp-His (motif 2), His (motif 3), Asp (motif 4) and His (motif 5), most of which are ligands to the two zinc ions in their active site. Sequence conservation between CPSF-73 and the canonical metallo-β-lactamases is limited to these signature motifs. While the first four motifs can be identified in the N-terminal segment of CPSF-73 (Supplemental Fig. 1a, Supplemental Table 1), the fifth motif was uncertain, with three candidates, A (Asp or Glu), B (His), and C (His) (Supplemental Fig. 1a), in the so-called β-CASP motif [12]. Motif B was proposed to be equivalent to motif 5 in the canonical metallo-β-lactamases. Another subunit of CPSF, CPSF-100, shares sequence conservation (Supplemental Fig. 1b) with CPSF-73 but lacks the putative Zn²⁺ binding residues.
To understand the roles of CPSF-73 and CPSF-100 in pre-mRNA 3'-end processing, we determined the structures of human CPSF-73 (residues 1-460) and yeast CPSF-100 (residues 1-720) (the crystallographic data are summarized in Supplemental Table 2). The two structures obtained for CPSF-73 were crystallized in the absence or presence of 0.5 mM zinc (although both structures contained zinc atoms; see below). We discovered serendipitously that in situ proteolysis by a fungal protease is crucial for the crystallization of yeast CPSF-100 [16].
The structure of CPSF-73 can be divided into two domains (Fig. 1a). The N-terminal residues (amino acids 1-208) form a domain similar to the structure of canonical me...