Simple repetitive DNA sequences are a widespread and abundant feature of genomic DNA. Several features characterize such sequences: (1) they typically consist of a variety of repeated motifs of 1-10 bases, but may include much larger repeats as well; (2) larger repeat units often include shorter ones within them; (3) long polypyrimidine and poly-CA tracts are often found; and (4) tandem arrangements of closely related motifs are often found. We propose that slipped-strand mispairing (SSM) events, in concert with unequal crossing-over, can readily account for all of these features. The frequent occurrence of long tandem repeats of particular motifs (polypyrimidine and poly-CA tracts) appears to result from nonrandom patterns of nucleotide substitution. We argue that the intrahelical process of slipped-strand mispairing is much more likely to be the major factor in the initial expansion of short repeated motifs and that, after initial expansion, simple tandem repeats may be predisposed to further expansion by unequal crossing-over or other interhelical events because of their propensity to mispair. Evidence is presented that single-base repeats (the shortest possible motifs) are represented by longer runs in mammalian introns than would be expected on a random basis, supporting the idea that SSM may be a ubiquitous force in the evolution of the eukaryotic genome. Simple repetitive sequences may therefore represent a natural ground state of DNA unselected for coding functions.
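The comparison of single-base runs against a random expectation can be illustrated with a minimal Python sketch. It is not the authors' analysis: the shuffle-based null model, the function names (`run_length_counts`, `longest_run`, `shuffled_longest_runs`), and the toy sequence are all assumptions made here for illustration. Shuffling preserves base composition while destroying order, so it gives one simple notion of "expected on a random basis".

```python
from collections import Counter
from itertools import groupby
import random


def run_length_counts(seq):
    """Count maximal single-base runs, keyed by (base, run_length)."""
    counts = Counter()
    for base, group in groupby(seq.upper()):
        counts[(base, sum(1 for _ in group))] += 1
    return counts


def longest_run(seq):
    """Length of the longest single-base run in seq (0 for an empty string)."""
    return max((sum(1 for _ in g) for _, g in groupby(seq.upper())), default=0)


def shuffled_longest_runs(seq, n_shuffles=200, seed=0):
    """Longest-run lengths in composition-preserving shuffles of seq.

    Assumed null model: base order is random, base composition is fixed.
    """
    rng = random.Random(seed)
    bases = list(seq.upper())
    results = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        results.append(longest_run("".join(bases)))
    return results


# Hypothetical toy sequence (not real intron data): a 12-base poly-A tract
# embedded in otherwise non-repetitive flanks.
toy = "ACGT" * 10 + "A" * 12 + "CGTA" * 10

observed = longest_run(toy)
null = shuffled_longest_runs(toy)
# Fraction of shuffles whose longest run is at least as long as the observed one.
fraction_at_least = sum(r >= observed for r in null) / len(null)
print(observed, max(null), fraction_at_least)
```

A small fraction in the last line would indicate that the observed run is longer than base composition alone tends to produce. A real analysis would need actual intron sequences and a null model chosen with care, for example one that also preserves dinucleotide frequencies.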
Slipped-strand mispairing (SSM) may play a major role in repetitive DNA sequence evolution by generating large numbers of short frameshift mutations within simple tandem repeats. Here we examine the frequency and size spectrum of frameshifts generated within poly-CA/TG sequences inserted into bacteriophage M13 in Escherichia coli hosts. The frequency of detectable frameshifts within a 40 bp tract of poly-CA/TG is greater than one percent and increases more than linearly with length, being lower by a factor of four in a 22 bp target sequence. The frequency increases more than 13-fold in mutL and mutS host cells, suggesting that a high proportion of frameshift events are normally repaired by methyl-directed mismatch repair. Of the 87 sequenced frameshifts in this study, 96% result from deletion or insertion of only one or two 2 bp repeat units. The most frequent events are 2 bp deletions, 2 bp insertions, and 4 bp deletions, the relative frequencies of these events being about 18:6:1.