2011
DOI: 10.1371/journal.pone.0018464

Algebraic Distribution of Segmental Duplication Lengths in Whole-Genome Sequence Self-Alignments

Abstract: Distributions of duplicated sequences from genome self-alignment are characterized, including forward and backward alignments in bacteria and eukaryotes. A Markovian process without auto-correlation should generate an exponential distribution expected from local effects of point mutation and selection on localised function; however, the observed distributions show substantial deviation from exponential form – they are roughly algebraic instead – suggesting a novel kind of long-distance correlation that must be…

Cited by 24 publications (34 citation statements)
References 44 publications
“…statistical signature; i.e., it is well described by a power law with exponent −3 (black curve in Fig. 1), which can also be found in other species and was first reported by Gao and Miller [14]. Given this empirical result and considering that it only holds over one order of magnitude, it is reasonable to question if a meaningful model can be developed that also explains this finding mathematically; see Ref.…”
supporting
confidence: 64%
“…A main evolutionary force of genome evolution that we have not considered so far is sequence duplication. Genomic duplication is a major source of genomic rearrangements and it is quite common across the whole phylogenetic tree [33,34].…”
Section: Genomic Duplication Is the Key Ingredient To Explain The
mentioning
confidence: 99%
“…1 was first obtained for a small number of chromosomes by self-alignment. 20 Sequence alignment is a heuristic process whose outcome depends on the method applied and parameters chosen. Only once the power-law behavior was reproduced as described here, wherein the distribution computed is defined independently of any algorithm, could it be confidently asserted that the power-law behavior was not an artifact of the alignment method.…”
Section: Length Distributions Of Natural Genome Sequences
mentioning
confidence: 99%
“…For example, although the amplitudes of the curves often undergo multifold enhancement when bases are counted modulo the most frequent base substitution event, 38 C↔T/G↔A, it turns out that the power-law behavior remains unchanged; this observation yields an important constraint on evolutionary dynamics. 20,39 Additionally, as much as half of each natural genome occurs as so-called “repetitive sequence”; length distributions of chromosomes from which these repetitive sequences have been excised (RM for repeat-masking) nevertheless exhibit a power-law. (see the plots for RM, 2b, and RM-2b in the supporting information.)…”
Section: Length Distributions Of Natural Genome Sequences
mentioning
confidence: 99%