SCAN is a protein domain frequently found at the N termini of proteins encoded by mammalian tandem zinc finger (ZF) genes, whose structure is known to be similar to that of retroviral gag capsid domains and whose multimerization has been proposed as a model for retroviral assembly. We report that the SCAN domain is derived from the C-terminal portion of the gag capsid (CA) protein from the Gmr1-like family of Gypsy/Ty3-like retrotransposons. On the basis of sequence alignments and phylogenetic distributions, we show that the ancestral host SCAN domain (ESCAN for extended SCAN) was exapted from a full-length CA gene from a Gmr1-like retrotransposon at or near the root of the tetrapod animal branch. A truncated variant of ESCAN that corresponds to the annotated SCAN domain arose shortly thereafter and appears to be the only form extant in mammals. The Anolis lizard has a large number of tandem ZF genes with N-terminal ESCAN or SCAN domains. We predict DNA binding sites for all Anolis ESCAN-ZF and SCAN-ZF proteins and demonstrate several highly significant matches to Anolis Gmr1-like sequences, suggesting that at least some of these proteins target retroelements. SCAN is known to mediate protein dimerization, and the CA protein multimerizes to form the core retroviral and retrotransposon capsid structure. We speculate that the SCAN domain originally functioned to target host ZF proteins to retroelement capsids.
Gmr1-like elements are a class of Ty3/Gypsy long terminal repeat (LTR) retrotransposons similar to most other Ty3/Gypsy elements but with a different protein domain order within the pol gene; the domain order of Gmr1-like elements is Ty1/copia-like, PRO-INT-RT-RNH (protease [PRO]-integrase [INT]-reverse transcriptase [RT]-RNase H [RNH]), rather than the typical Gypsy order PRO-RT-RNH-INT (15). Gmr1-like elements have an origin before the common ancestor of deuterostomes (vertebrates along with sea urchins, etc.) (15). In this study, we investigate the relationships of the capsid structural proteins encoded by Gmr1-like elements and the SCAN domain present in amniotes (mammals, birds, and reptiles). It has previously been noticed and briefly remarked upon that in lower vertebrates, sequences matching a SCAN domain profile reside in retrovirus-like polyproteins (36). In addition, protein structural similarity between the SCAN domain and the HIV C-terminal capsid has been observed. The SCAN domain has been shown to multimerize by a domain-swapping mechanism in which two monomers swap their major homology regions (MHRs), and it has been hypothesized that domain swapping in this fashion plays a role in retroviral assembly (23, 26). We speculate that the protein-protein interaction function of the SCAN domain was borrowed from the capsid domain of a Gmr1-like element, which multimerizes in vivo to form the core retrotransposon capsid structure.
The SCAN domain is a conserved motif of approximately 80 amino acids found at the N termini of many Cys2His2 (C2H2)-type zinc finger (ZF) proteins and is leucine rich and dom...