2002
DOI: 10.1103/physreve.66.031913

Simplifying the mosaic description of DNA sequences

Abstract: By using the Jensen-Shannon divergence, genomic DNA can be divided into compositionally distinct domains through a standard recursive segmentation procedure. Each domain, while significantly different from its neighbours, may however share compositional similarity with one or more distant (nonneighbouring) domains. We thus obtain a coarse-grained description of the given DNA string in terms of a smaller set of distinct domain labels. This yields a minimal domain description of a given DNA sequence, significant…
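
As a rough illustration of the recursive segmentation idea mentioned in the abstract, the sketch below finds the cut that maximizes the length-weighted Jensen-Shannon divergence between the left and right nucleotide compositions, then recurses on both halves. This is not the authors' implementation: the function names, the fixed `threshold` stopping rule (standing in for a proper significance test), and the `min_len` guard are all assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(counts, total):
    """Shannon entropy in bits of a symbol count table."""
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def best_split(seq):
    """Return (position, divergence) for the cut maximizing the weighted JS divergence.

    The divergence for a cut at i is H(whole) - (i/n) H(left) - ((n-i)/n) H(right).
    Returns (None, 0.0) when no cut gives a positive divergence.
    """
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_pos, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction keeps only positive counts
        d = h_all - (i / n) * entropy(left, i) - ((n - i) / n) * entropy(right, n - i)
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, threshold=0.1, min_len=4, offset=0):
    """Recursively split seq; return sorted cut positions in the original coordinates.

    threshold is a hypothetical stand-in for a statistical significance criterion.
    """
    if len(seq) < 2 * min_len:
        return []
    pos, d = best_split(seq)
    if pos is None or d < threshold:
        return []
    return (segment(seq[:pos], threshold, min_len, offset)
            + [offset + pos]
            + segment(seq[pos:], threshold, min_len, offset + pos))
```

For example, `segment("A" * 20 + "G" * 20)` places a single cut at the boundary between the two runs. Merging similar non-neighbouring domains under shared labels, as the abstract describes, would be a separate step not shown here.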

Cited by 15 publications
(16 citation statements)
References 24 publications
“…Guha et al [10] provide a graceful tradeoff between the running time and the quality of the obtained solution. Azad et al [2], Li [16], and Ramensky et al [20] apply segmentation on genomic sequences, while Koivisto et al [15] use segmentation to find blocks on haplotypes. One should note that in statistics the question of segmentation of a sequence or time series is often called the change-point problem [6].…”
Section: Related Work (mentioning)
confidence: 99%
“…In order to reduce the dimensionality from d to m we perform PCA on the set of weighted points U_S. The PCA computation gives for each segment vector u_j an approximate representation u_j such that…”
Section: Algorithm (mentioning)
confidence: 99%
“…In his formulation he uses hidden Markov states to model different compositional properties within each DNA segment, and his solution uses the Viterbi algorithm to determine the most probable sequence of states. More recently, Azad et al [2] formulate a similar problem to (k, h)-segmentation but their solution is based on greedy "split and merge" and it does not provide any theoretical guarantee.…”
Section: Introduction (mentioning)
confidence: 99%
“…The segmentation problem for genome sequences and time series has been discussed widely; see, e.g., [10,3,7,9,5,16,14,15,11,4,2]. For a wide variety of score functions, dynamic programming can be used to obtain the best segmentation of an n-element sequence into k pieces in time O(n^2 k); this idea goes back at least to Bellman [3] in 1961.…”
Section: Introduction (mentioning)
confidence: 99%
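
The O(n^2 k) dynamic program referred to in the last statement can be sketched as follows. The citing paper does not specify a score function here, so the example assumes a sum-of-squared-errors cost around each segment mean; the name `k_segmentation` and the return format are likewise assumptions for illustration.

```python
def k_segmentation(xs, k):
    """Best split of xs into k contiguous pieces under squared-error cost.

    Returns (total_cost, interior_cut_positions). Runs in O(n^2 k) time.
    """
    n = len(xs)
    assert 1 <= k <= n, "need between 1 and len(xs) segments"

    # Prefix sums of x and x^2 give each segment cost in O(1).
    s1, s2 = [0.0], [0.0]
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(i, j):
        """Squared error of xs[i:j] around its own mean (requires i < j)."""
        seg_sum = s1[j] - s1[i]
        return (s2[j] - s2[i]) - seg_sum * seg_sum / (j - i)

    inf = float("inf")
    # best[p][j]: minimal cost of covering xs[:j] with exactly p segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for p in range(1, k + 1):
        for j in range(p, n + 1):
            for i in range(p - 1, j):
                c = best[p - 1][i] + cost(i, j)
                if c < best[p][j]:
                    best[p][j] = c
                    back[p][j] = i

    # Walk the back-pointers from the end to recover segment start positions.
    starts = []
    j = n
    for p in range(k, 0, -1):
        j = back[p][j]
        starts.append(j)
    starts.reverse()
    return best[k][n], starts[1:]  # drop the leading 0; keep interior cuts
```

Calling `k_segmentation([1, 1, 1, 5, 5, 5, 9, 9], 3)` recovers the three constant runs, with cuts after the third and sixth elements. Other additive score functions can be swapped in by replacing `cost`, provided each segment's cost can be evaluated independently.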