A highly abundant repetitive DNA sequence family of Arabidopsis, AtCon, is composed of 178-bp tandemly repeated units and is located at the centromeres of all five chromosome pairs. Analysis of multiple copies of AtCon showed 95% conservation of nucleotides, with some alternative bases, and revealed two boxes, 30 and 24 bp long, that are 99% conserved. Sequences at the 3′ end of these boxes showed similarity to yeast CDEI and human CENP-B DNA-protein binding motifs. When oligonucleotides from less conserved regions of AtCon were hybridized in situ and visualized by using primer extension, they were detected on specific chromosomes. When used for polymerase chain reaction with genomic DNA, single primers or primer pairs oriented in the same direction showed negligible amplification, indicating a head-to-tail repeat unit organization. Most primer pairs facing in opposite directions gave several strong bands corresponding to their positions within AtCon. However, consistent with the primer extension results, some primer pairs showed no amplification, indicating that there are chromosome-specific variants of AtCon. The results are significant because they elucidate the organization, mode of amplification, dispersion, and evolution of one of the major repeated sequence families of Arabidopsis. The evidence presented here suggests that AtCon, like human α satellites, plays a role in Arabidopsis centromere organization and function.
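The expectation that opposing primers on a head-to-tail tandem array give several bands can be illustrated with a short calculation. The sketch below is not part of the original analysis: the function name, the coordinate convention, and the example primer positions are hypothetical, and only the 178-bp unit length comes from the text. It assumes perfectly regular repeats and ignores primer length and amplification efficiency.

```python
UNIT_LENGTH = 178  # AtCon repeat unit length in bp (from the text)


def expected_product_sizes(fwd_start, rev_end, n_bands=3, unit=UNIT_LENGTH):
    """Return the first n_bands expected PCR product sizes (bp) for two
    primers facing each other on a head-to-tail tandem array.

    fwd_start: 0-based position within the unit of the 5' base of the
               forward primer (hypothetical coordinate convention).
    rev_end:   0-based position within the unit, on the same strand
               coordinates, of the 5' base of the reverse primer.
    """
    if not (0 <= fwd_start < unit and 0 <= rev_end < unit):
        raise ValueError("primer positions must lie within one repeat unit")
    # Shortest product: from the forward primer to the nearest downstream
    # copy of the reverse primer site, wrapping into the next unit if needed.
    smallest = (rev_end - fwd_start) % unit + 1
    # Each further band spans one additional complete repeat unit.
    return [smallest + k * unit for k in range(n_bands)]


if __name__ == "__main__":
    # Illustrative positions only; these are not primers from the study.
    print(expected_product_sizes(10, 109))
    print(expected_product_sizes(150, 20))
```

Under these assumptions the bands form a ladder spaced one repeat unit apart, and the size of the smallest band reflects the relative positions of the two primers within the unit.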
INTRODUCTION

Understanding the organization, sequence, structure, and function of the DNA at the centromeres of chromosomes of fungi, plants, and animals is of both fundamental and applied importance because of the role of the centromere in chromosome segregation, in karyotypic stability, and in generating artificial chromosomes as cloning or expression vectors. In yeasts, the role of particular sequences and their protein interactions is becoming clear (Uzawa and Yanagida, 1992; Clarke et al., 1993; Hegemann and Fleig, 1993). However, despite these successes, the sequence requirements for the formation of a mammalian centromere remain unclear (Kipling and Warburton, 1997). In plants, various repetitive sequences have been isolated, and in situ hybridization with mitotic chromosome preparations has shown them to localize in broad centromeric regions (see below); however, little more is known.

Core centromeric DNA sequences and the flanking repetitive DNA motifs that are essential for function when reintroduced into the cell have been isolated from both budding yeast (Saccharomyces cerevisiae) and fission yeast (Schizosaccharomyces pombe), although the difference in average length of the centromere-associated DNA between the two species is enormous (Hegemann and Fleig, 1993). A functional centromere of S. cerevisiae is contained within a 125-bp sequence that carries three types of relatively simple protein binding DNA elements (see Clarke, 1990): CDEI (for centromere DNA element I), consisting of eight nucleotides (RTCACRTG, where R is a purine); CDEII, an AT-rich 78- to 86-nucleotide sequence...