For most eukaryotic species, the centromere is composed of millions of base pairs of tandemly repeated deoxyribonucleic acid (DNA) sequences. Centromere function is broadly conserved across eukaryotic phyla, yet centromere DNA presents several unique conundrums for biologists, further complicated by the challenges in studying highly repeated regions of complex genomes. Contrary to the expectation that centromeric sequences would be constrained to maintain centromere function across species, these sequences are among the most rapidly evolving sequences in any given genome. This discordance between functional constraint and sequence divergence, termed the ‘centromere paradox’, appears to defy basic laws of Mendelian inheritance. Multiple genetic mechanisms have been proposed to explain centromeric DNA complexity and rapid evolutionary divergence, taking into consideration the unique chromosome architecture and dynamics of the centromere during both mitosis and meiosis. Stochastic processes affecting sequence evolution and the selective constraint necessary for centromere protein recognition are balanced in an ongoing conflict that ultimately manifests as rapid centromere DNA evolution.
Key Concepts:
Loss of centromere function does not equate to loss of centromere sequence. Conversely, centromere sequences do not strictly demarcate a functional centromere.
The constitution of centromeric satellite DNAs across species differs not only in sequence, but also in the repeat unit length, abundance of the repeat unit and the complexity of the genomic structure of multiple repeat units.
The more commonly found regional centromere is typified by a distinct, linear organisation of a given satellite repeat unit into high‐copy tandem repeats of that unit that can span megabases of DNA.
The high homology of the higher‐order satellite repeats within a centromere is consistent with their function to effectively bind centromere proteins, wherein selection may favour homogeneity to retain centromere function.
Initial forms of an active centromere do not necessarily require higher‐order repeat (HOR) satellite arrays; rather, such arrays arise over evolutionary timescales following stable establishment and inheritance of a new centromere.
The process of molecular drive, involving concerted evolution and gene conversion, results in the homogenisation and fixation of a given repeat variant and may lead to the convergent and concerted evolution of satellites within one species.
Genetic conflict and/or meiotic drive may be responsible for the different centromere satellite sequence suites found between species.
Centromere drive may be responsible for the rapid divergence of functional centromeric sequences between species.
The coding sequences for two centromere proteins, CENP‐A (centromere‐specific histone H3) and CENP‐C (a centromere‐specific DNA binding protein), evolve at rates faster than expected for either neutral (unconstrained) or purifying (selectively constrained) evolution in many species.
Mobile elements can impact the rate at which satellites are derived, expand, contract and homogenise within a species.
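The homogenisation attributed to molecular drive in the Key Concepts can be illustrated with a toy simulation. The sketch below is only a minimal illustration under strong assumptions that are not taken from the article: a single tandem array, unbiased gene conversion in which one randomly chosen repeat unit overwrites another, and no mutation, selection, unequal crossing over or population structure. The names `gene_conversion_step` and `simulate` are hypothetical.

```python
import random


def gene_conversion_step(array, rng):
    """Overwrite one randomly chosen repeat unit with a copy of another.

    Assumption: conversion is unbiased, so every unit is equally likely
    to act as donor or recipient.
    """
    donor, recipient = rng.sample(range(len(array)), 2)
    array[recipient] = array[donor]


def simulate(n_units=20, n_steps=2000, seed=0):
    """Track how many distinct repeat variants remain in a toy tandem array."""
    rng = random.Random(seed)
    # Start from the least homogeneous state: every unit is a different variant.
    array = [f"variant_{i}" for i in range(n_units)]
    history = [len(set(array))]
    for _ in range(n_steps):
        gene_conversion_step(array, rng)
        history.append(len(set(array)))
    return array, history


if __name__ == "__main__":
    final_array, history = simulate()
    print("distinct variants at start:", history[0])
    print("distinct variants at end:  ", history[-1])
```

Because each step only copies an existing unit, the number of distinct variants can never increase in this model, which is the sense in which repeated conversion tends towards homogenisation and eventual fixation of one variant. Real satellite arrays also experience new mutations and other turnover mechanisms, which this sketch deliberately omits.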