We found that the genes for human U2 small nuclear RNA (snRNA) are organized as a nearly perfect tandem array of 10 to 20 copies per haploid genome. Although the coding region for the mature form of U2 RNA was only 188 base pairs (bp) long, the basic repeating unit of the tandem array was 6 kilobase pairs in length. Comparison of DNA sequences immediately upstream from human U1 and U2 genes revealed two regions of strong homology: region I (15 bp long) lay upstream of region II (20 bp long) and was separated from it by about the same distance in U1 genes (25 bp) as in U2 genes (21 bp); however, region I and region II were located 174 bp further upstream from the 5' end of the snRNA coding sequence in U1 genes than in U2 genes. Homologs of region II were also found upstream of the snRNA coding region in a mouse U2 gene and two rat U1 genes. Murphy et al. (Cell 29:265-274, 1982)

U2 RNA is an abundant small nuclear RNA (snRNA) found in both plant and animal cells (5). The primary sequence of this snRNA species has been highly conserved through evolution, as judged by nucleotide sequence analysis of U2 RNA from rat (30), chicken, pheasant (4), and wheat embryo (J. M. Skuzeski, personal communication), fingerprint analysis of U2 RNA from mouse and man (17, 26), and DNA sequence analysis of a gene for mouse (27) and frog U2 snRNA (20). U2, like U1, has a 5' terminal 2,2,7-trimethylguanosine cap structure, and both U1 and U2 appear to be transcribed by RNA polymerase II (5, 25). U2 RNA and the snRNAs U1, U4, U5, and U6 exist as components of small nuclear ribonucleoprotein particles (13, 14, 17).
Indirect evidence suggests that U1 small nuclear ribonucleoproteins play a role in the nuclear splicing of mRNA precursors (16, 23), and a similar role for U2 small nuclear ribonucleoproteins has been proposed (28; but see reference 4). The initial characterization of the multigene families for human U1, U2, U3, U4, and U6 snRNAs has established that most of the human chromosomal loci complementary to these snRNAs are actually defective gene copies, or pseudogenes, which appear to be dispersed in the genome (2, 8, 9, 11, 12, 19, 22, 36, 41 Fig. 2A and C) was labeled in vitro with [α-32P]dATP by primed synthesis of the complementary strand.

Blotting procedures. Genomic blots were prepared by the method of Southern (34) except that the DNA was depurinated by soaking the gel in 0.25 N HCl before denaturation (37).