2006
DOI: 10.1145/1120582.1120586

Hidden word statistics

Abstract: We consider the sequence comparison problem, also known as the "hidden" pattern problem, where one searches for a given subsequence in a text (rather than a string understood as a sequence of consecutive symbols). A characteristic parameter is the number of occurrences of a given pattern w of length m as a subsequence in a random text of length n generated by a memoryless source. Spacings between letters of the pattern may either be constrained or not in order to define valid occurrences. We determine the mean and…

Cited by 39 publications (70 citation statements)
References 34 publications
“…The latter case was already discussed in [10], [9]. We derive our results by relating the conditional probability distribution of the output of the deletion channel given the input to the so-called hidden pattern matching analyzed recently in [1], [7].…”
mentioning
confidence: 82%
“…Let Ω_x(w) denote the number of occurrences of w as a subsequence (i.e., not consecutive symbols) of x, that is, (1) where I_A = 1 if A is true and zero otherwise. The problem of counting subsequences in a text is known as the hidden pattern matching problem and was studied in [1], [7]. In this paper, to derive our results we first represent the mutual information between the input and output of a deletion channel in terms of the count Ω_X(w) for a random sequence X. Theorem 1.…”
mentioning
confidence: 99%
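
The count described in the statement above, the number of ways a pattern w occurs as a (not necessarily contiguous) subsequence of a text x, can be computed with a short dynamic program. The following is a minimal sketch, not code from the cited papers; the function name `count_hidden_occurrences` is hypothetical, and it covers only the unconstrained case (no limits on the spacings between matched letters).

```python
def count_hidden_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of `pattern` as a subsequence of `text`.

    Unconstrained case only: any gaps between matched symbols are allowed.
    The function name is illustrative, not taken from the cited papers.
    """
    m = len(pattern)
    # ways[j] = number of ways to match the first j pattern symbols
    # using the text symbols scanned so far.
    ways = [0] * (m + 1)
    ways[0] = 1  # the empty prefix is matched in exactly one way

    for symbol in text:
        # Scan j from right to left so that the current text symbol
        # extends only matches that ended strictly before it.
        for j in range(m, 0, -1):
            if pattern[j - 1] == symbol:
                ways[j] += ways[j - 1]

    return ways[m]


if __name__ == "__main__":
    # "aa" occurs in "aaa" at index pairs (0,1), (0,2), (1,2).
    print(count_hidden_occurrences("aaa", "aa"))  # 3
```

The sketch runs in time proportional to the product of the two lengths and uses memory proportional to the pattern length; handling constrained spacings would require additional state and is not attempted here.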
“…Pattern matching in constrained sequences can in principle be analyzed by various versions of the de Bruijn graph [2,7] or automaton approach [2,17]. This is an elegant and general approach but it sometimes leads to complicated analyses and is computationally extensive.…”
Section: Introduction
mentioning
confidence: 99%
“…Pattern matching in constrained sequences can in principle be analyzed by various versions of the de Bruijn graph [2], [6] or automaton approach [2], [14]. This is an elegant and general approach but it sometimes leads to complicated analyses and is computationally extensive.…”
Section: Introduction
mentioning
confidence: 99%
“…Based on this method, one represents the number of pattern occurrences as a product of a matrix representation of the underlying de Bruijn graph and hence its largest eigenvalue (cf. [2], [6]). In general, this matrix is of a large dimension and such a solution is not easily interpretable in terms of the original patterns.…”
Section: Introduction
mentioning
confidence: 99%