Recently, an enrichment of identical matching sequences has been found in many eukaryotic genomes. Their length distribution exhibits a power law tail, raising the question of what evolutionary mechanism or functional constraints would be able to shape this distribution. Here we introduce a simple and evolutionarily neutral model, which involves only point mutations and segmental duplications, and produces the same statistical features as observed for genomic data. Further, we extend a mathematical model for random stick breaking to analytically show that the exponent of the power law tail is −3 and universal, as it does not depend on the microscopic details of the model.

DOI: 10.1103/PhysRevLett.110.148101
PACS numbers: 87.10.Vg, 87.10.Ca, 87.18.Wd, 87.23.Kg

Ever since Susumu Ohno wrote his influential book to highlight the role of gene duplication in evolution [1], it has been well recognized that duplication and subsequent change of genetic material allow the exploration of evolutionary trajectories not accessible by point mutations only. Having completed the sequencing of the human genome, we know today that about 5% of primate genomes are composed of so-called segmental duplications, often spanning tens of kbp [2,3]. The majority of those duplications are thought to have no direct function. In contrast to the very rich discussion about "the evolutionary fate and consequences of duplicated genes" [4], the destiny of duplicated nonfunctional DNA segments is in the majority of cases clear: they will dissolve into the genomic background by random mutations. However, as we show in this Letter, this dispersion process generates interesting statistical properties of the length distribution of identical sequence segments in genomes, which exhibits scale invariance with an integer exponent.
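The dissolution of a single duplicate by random mutations can be illustrated with a minimal Python sketch. It is not the model of this Letter: the function names, the sequence length of 1000, and the 10 substitutions are hypothetical choices made only for illustration, and only point substitutions (no insertions, deletions, or further duplications) are applied.

```python
import random


def mutate(seq, n_mutations, rng):
    """Return a copy of seq with n_mutations substitutions at distinct random sites."""
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n_mutations):
        # Always substitute a different base, so every chosen site becomes a mismatch.
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    return "".join(bases)


def identical_run_lengths(a, b):
    """Lengths of maximal runs of consecutive positions at which a and b agree."""
    runs, current = [], 0
    for x, y in zip(a, b):
        if x == y:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility
original = "".join(rng.choice("ACGT") for _ in range(1000))
duplicate = mutate(original, 10, rng)  # the copy after 10 point substitutions
runs = identical_run_lengths(original, duplicate)
print(sorted(runs, reverse=True))
```

Each substitution splits one identical piece into at most two shorter ones, so the 10 substitutions leave at most 11 pieces whose lengths sum to 990. This single-copy sketch only shows the breakup of one duplicate; it makes no attempt to reproduce the power law tail of the length distribution, which the model described above obtains from point mutations and segmental duplications together.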
We argue that this distribution is the characteristic mark of processes that are continuous and perpetual on evolutionary time scales and generate segmental duplications of genetic material and disperse them by random mutations into the genomic background.

Just after its duplication, a duplicated sequence segment will start out 100% identical to its original; subsequently random nucleotide substitutions and small scale insertions or deletions will break it into two and then more pieces, each being still identical to the corresponding segment in the original. This dispersion process can easily be observed in sequenced genomes when considering maximal segments of exactly matching nucleotides, i.e., copies of sequence segments that are equal over their entire length but differ on both ends. Such identical matches can easily be found using, for example, a gapless local alignment algorithm with infinite mismatch costs [5]. More advanced techniques employing suffix trees [6] or word counts are considerably faster for counting long and short segments, respectively. Independent of the algorithm, a self-alignment will include the global match along the diagonal of the alignment grid but will also show smaller (off-diagonal) matches repre...