1995
DOI: 10.1089/cmb.1995.2.417

Exceptional Motifs in Different Markov Chain Models for a Statistical Analysis of DNA Sequences

Abstract: Identifying exceptional motifs is often used for extracting information from long DNA sequences. The two difficulties of the method are the choice of the model that defines the expected frequencies of words and the approximation of the variance of the difference T(W) between the number of occurrences of a word W and its estimation. We consider here different Markov chain models, either with stationary or periodic transition probabilities. We estimate the variance of the difference T(W) by the conditional varia…
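To make the quantity T(W) in the abstract concrete, here is a minimal Python sketch of one common way to form the estimate: a plug-in estimator for a stationary Markov chain of order m, built from observed counts of (m+1)-letter and m-letter subwords. The function names `count_occurrences`, `expected_count`, and `t_statistic` are hypothetical, the estimator ignores boundary effects, and the sketch does not attempt the variance approximation or the periodic models that the paper studies.

```python
def count_occurrences(seq: str, word: str) -> int:
    """Count occurrences of word in seq, including overlapping ones."""
    k = len(word)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == word)


def expected_count(seq: str, word: str, m: int) -> float:
    """Plug-in estimate of the count of word under an order-m Markov model.

    Assumption (not taken from the source page): the estimate is the product
    of the counts of all (m+1)-letter subwords of word, divided by the
    product of the counts of its interior m-letter subwords.
    """
    k = len(word)
    if not 1 <= m <= k - 1:
        raise ValueError("this sketch requires 1 <= m <= len(word) - 1")
    numerator = 1.0
    for i in range(k - m):
        numerator *= count_occurrences(seq, word[i:i + m + 1])
    denominator = 1.0
    for i in range(1, k - m):
        denominator *= count_occurrences(seq, word[i:i + m])
    # A zero denominator implies a zero numerator, so the estimate is zero.
    return numerator / denominator if denominator else 0.0


def t_statistic(seq: str, word: str, m: int) -> float:
    """Difference T(W) between the observed count and its estimate."""
    return count_occurrences(seq, word) - expected_count(seq, word, m)
```

For a toy sequence such as `"ACGTACGTAC"` and the word `"CGT"` with m = 1, the estimate is N(CG) N(GT) / N(G). Judging whether a given T(W) is exceptional would additionally require a variance estimate, which this sketch leaves out.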

Cited by 82 publications (62 citation statements)
References 25 publications
“…Statistical methods were used to determine whether Chi Hi (compared with Chi Ec) occurs as a random or non-random sequence on the H. influenzae chromosome when the mono-, di- through heptanucleotide composition of the genome is taken into account (Schbath et al, 1995; see …”
Section: Statistical Over-representation Of Chi Hi Compared With Chi Ec
mentioning
confidence: 99%
“…Briefly, statistical analyses of genome sequences reveal a non-random organization of nucleotides (Blaisdell, 1984; Pevzner, 1992; Schbath et al, 1995; Karlin and Burge, 1995). Examples of this are seen in dinucleotide occurrence that is non-random and is proposed to be the signature of a particular microorganism (Karlin and Burge, 1995), or trinucleotide representation on the genome, which is determined in part by codon usage (Brendel et al, 1986).…”
Section: Sequence Analysis
mentioning
confidence: 99%
“…This simple model is useful to identify exceptional long words, given (m + 1)-word frequencies. Prum et al (1995) and Schbath et al (1995) study the normal approximation of N(W) corresponding to the asymptotic frame where the expectation of N(W) converges to infinity with n. If the expectation of N(W) is bounded when n increases, we then say that W is rare, and Poisson approximations are more reasonable. We show here that the number of overlapping occurrences of a rare word can be approximated by a compound Poisson variable, which reduces to a Poisson variable if the word W cannot overlap itself.…”
mentioning
confidence: 99%
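The statement above distinguishes words that can overlap themselves from words that cannot. As a hedged illustration of that distinction only, the following Python sketch lists the periods of a word, meaning the shifts p at which a second copy of the word can start inside the first. The helper names `self_overlap_periods` and `can_overlap_itself` are hypothetical and are not taken from the cited papers.

```python
def self_overlap_periods(word: str) -> list[int]:
    """Return every shift p (1 <= p < len(word)) at which word overlaps itself.

    A shift p is a period when the last len(word) - p letters of the word
    equal its first len(word) - p letters.
    """
    return [p for p in range(1, len(word)) if word[p:] == word[:-p]]


def can_overlap_itself(word: str) -> bool:
    """True when two occurrences of word can share at least one letter."""
    return bool(self_overlap_periods(word))
```

Under the statement quoted above, a rare word with no periods would fall in the plain Poisson case, while a word with at least one period would call for the compound Poisson approximation.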
“…Various statistical assessments of unusual abundance and rarity of individual words, including individual palindromes, in nucleotide sequences have been done using random-sequence models in a number of previous studies (Karlin et al 1992; Merkl and Fritz 1996; Rocha et al 1998, 2001; Schbath et al 1995, to name just a few). The present study, however, aims at investigating the unusual abundance and rarity of palindromes collectively rather than individually.…”
Section: Discussion
mentioning
confidence: 99%