The frequent variations of human complement component C4 gene size and gene number, plus the extensive polymorphism of the proteins, render C4 an excellent marker for major histocompatibility complex disease associations. As shown by definitive RFLPs, the tandemly arranged genes RP, C4, CYP21, and TNX are duplicated together as a discrete genetic unit termed the RCCX module. Duplications of the RCCX modules occurred by the addition of genomic fragments containing a long (L) or a short (S) C4 gene, a CYP21A or a CYP21B gene, and the gene fragments TNXA and RP2. Four major RCCX structures, bimodular L-L, bimodular L-S, monomodular L, and monomodular S, are present in the Caucasian population. These modules are readily detectable by TaqI RFLPs. The RCCX modular variations appear to be a root cause for the transfer of deleterious mutations from pseudogenes or gene segments in the RCCX to their corresponding functional genes. In a patient with congenital adrenal hyperplasia, we discovered a TNXB-TNXA recombinant with the deletion of RP2-C4B-CYP21B. Sequencing and analysis of the recombination breakpoint region yielded definitive proof for an unequal crossover between TNXA from a bimodular chromosome and TNXB from a monomodular chromosome.
Besides the immunoglobulins, complement component C4 is probably the most polymorphic serum protein. There are two isotypes, C4A and C4B, that manifest remarkable differences in chemical reactivities and serological properties (reviewed in Ref. 1). More than 34 allotypes for C4A and C4B have been demonstrated by agarose gel electrophoresis, based on gross differences in electric charge (2). Like the protein, the complement C4 genes are unusually complex, with frequent variations in gene size and gene number. In addition, the genes surrounding C4A or C4B also exhibit considerable variations.
These neighboring genes include RP1 or RP2 at the 5′ region, and CYP21A or CYP21B and TNXA or TNXB at the 3′ region ( Fig.
The human complement C4 genes in the HLA exhibit an unusual, dichotomous size polymorphism and a four-gene, modular variation involving the novel gene RP, complement C4, steroid 21-hydroxylase (CYP21), and tenascin-like Gene X (RCCX). The C4 gene size dichotomy is mediated by an endogenous retrovirus, HERV-K(C4). Nearly identical sequences for this retrotransposon are present at precisely the same location in the long C4 genes from the tandem RCCX Module I and Module II. Specific nucleotide substitutions between the long and short C4 genes have been identified and used for diagnosis. Southern blot analyses revealed that HERV-K(C4) is present at more than 30 locations in the human genome, varies within the population, and has analogs in the genomes of Old World primates with species-specific patterns. Evidence of intrachromosomal recombination between the two long terminal repeats of HERV-K(C4) is found near the huntingtin locus on chromosome 4. It is possible that members of HERV-K(C4) are involved in genetic instabilities, including those of the RCCX modules, and in protecting the host genome from retroviral attack through an antisense strategy.
The complement component C4 genes of Old World primates exhibit a long/short dichotomous size variation, except that chimpanzee and gorilla contain only short C4 genes. In humans, the long C4 gene has been shown to result from the integration of an endogenous retrovirus, HERV-K(C4), into intron 9. This 6.36 kilobase retroviral element is absent in short C4 genes. Here it is shown that the homologous endogenous retrovirus, ERV-K(C4), is present at precisely the same position in the long C4 gene of orangutan and African green monkey. Determination of the short C4 gene intron 9 sequences from human, three apes, two Old World monkeys, and a New World monkey allowed the establishment of consistent phylogenetic trees for primates, which favor a chimpanzee-gorilla clade. The 5' long terminal repeat (LTR) and 3' LTR of ERV-K(C4) in long C4 genes of human, orangutan, and African green monkey have similar sequence divergence values of 9.1%-10.5%. These values are more than five-fold higher than the sequence divergence of the homologous intron 9 sequences between the long and short C4 genes in higher primates. The lower intron 9 divergence is probably a result of homogenization or concerted evolution. We suggest that the 5' LTR and 3' LTR of an endogenous retrovirus can serve as a reliable reference point or a molecular clock for studies of gene duplication and gene evolution. This is because the 5' and 3' LTR sequences were identical at the time of retroviral integration and evolved independently of each other afterwards. Our data provide strong evidence for the short C4 gene being the ancestral form in primates, for trans-species evolution, and for the "slow-down" phenomenon of sequence divergence in great apes.
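The LTR molecular-clock reasoning can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis: the function names, the toy sequences, the choice of the Jukes-Cantor correction, and the substitution rate are all hypothetical placeholders. Because the two LTRs start out identical and then diverge independently, the corrected divergence d between them is taken as roughly 2 × rate × time, so time ≈ d / (2 × rate).

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites between two pre-aligned sequences.

    Columns containing a gap ('-') or an ambiguous base ('N') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance (substitutions per site)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def integration_age(distance: float, rate_per_site_per_year: float) -> float:
    """Years since integration, assuming both LTRs evolve at the same rate.

    The factor 2 reflects that each LTR accumulates changes independently.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # Toy aligned fragments standing in for a 5' LTR and a 3' LTR (not real data).
    ltr5 = "ACGTACGTACGTACGTACGT"
    ltr3 = "ACGTACGTATGTACGTACGA"
    p = p_distance(ltr5, ltr3)
    d = jukes_cantor(p)
    # ASSUMPTION: placeholder rate, not a value taken from the source.
    assumed_rate = 2e-9
    print(f"p-distance: {p:.3f}")
    print(f"JC distance: {d:.4f}")
    print(f"estimated age (years): {integration_age(d, assumed_rate):.3g}")
```

The age estimate scales inversely with the assumed rate, so any real use would need a rate justified for the lineage in question.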
Helicases are essential enzymes for life because DNA replication, DNA repair, recombination, transcription, RNA splicing and translation all involve more than one helicase to unwind DNA or RNA. We have discovered, cloned and partially characterized a novel human helicase gene, SKI2W. Human SKI2W is located between the RD and RP1 genes in the class III region of the major histocompatibility complex (MHC) on chromosome 6, a genomic region associated with many malignant, genetic and autoimmune diseases. The derived amino acid sequence of human SKI2W showed an open reading frame of 1246 residues. It contains consensus sequences for the structural motifs of an RNA helicase with a DEVH box. It has a leucine zipper motif that may be important for protein dimerization, and an RGD motif close to the N-terminus that might serve as a ligand for integrin or cell adhesion molecules. SKI2W shares a striking and extensive similarity with the yeast Ski2p, which is involved in the inhibition of translation of poly(A) negative [poly(A)-] RNA and plays an important role in antiviral activities. Human SKI2W fusion protein expressed in insect cells using a baculovirus vector has ATPase activity. The human SKI2W protein and the yeast Ski2p share extensive sequence similarities with another putative human protein, KIAA0052, suggesting the presence of a new gene family that may be involved in translational regulation of cellular and viral RNA.