2004
DOI: 10.1007/978-3-540-30213-1_37

Efficient Extraction of Structured Motifs Using Box-Links

Abstract: In this paper, we propose a new algorithm for the extraction of repeated motifs that may represent binding-site consensi in genomic sequences. In particular, the algorithm extracts structured motifs, which we define as a collection of highly conserved motifs with pre-specified sizes and spacings between them. This type of motif is highly relevant in the search for gene regulatory mechanisms, since promoter models can be effectively represented by structured motifs. The algorithm uses factor trees, a variation o…

Citations: cited by 13 publications (11 citation statements)
References: 14 publications
“…The idea is to replace each edge with the start and end indexes (positions) in the input string S for the label (substring) on the edge. For example, the edge labeled ACG from the root node, will be encoded as [0, 2], whereas edge labeled ACG leading to leaf (suffix) 0, will be encoded as [3,5]. Using edge-encoding, each edge requires only a constant amount of space, so the total space required for the whole tree is O(n).…”
Section: Preliminaries (mentioning)
confidence: 99%
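The edge encoding described in the statement above can be illustrated with a minimal sketch. The input string `S`, the `Edge` class, and its `label` method are hypothetical names chosen for illustration; they are not taken from the citing paper, and `S` is not claimed to be the string behind the quoted example. The sketch only shows that a pair of inclusive indexes is enough to recover an edge label, so each edge costs constant space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A suffix-tree edge stored as two indexes into the input string."""
    start: int  # index in S of the label's first character
    end: int    # index in S of the label's last character (inclusive)

    def label(self, s: str) -> str:
        # Recover the substring on demand instead of storing a copy of it.
        return s[self.start:self.end + 1]


# Hypothetical input string in which "ACG" occurs at positions 0-2 and 3-5.
S = "ACGACGT"

first = Edge(0, 2)   # encoded as [0, 2]
second = Edge(3, 5)  # encoded as [3, 5]

print(first.label(S), second.label(S))
```

Both edges carry the same three-character label, yet each stores only two integers regardless of how long the label is.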
“…Some of the approaches [22,23,26,27,28] completely abandon the use of suffix links and sacrifice the theoretically superior linear construction time in exchange for a quadratic time algorithm with better locality of reference. However, many fast string-processing algorithms, such as tandem repeats [19], structural motifs [5], approximate string matching [6], and genome alignment [10,3,21], rely heavily on suffix links for efficiency, and thus cannot be used directly with these disk-based suffix trees. Some approaches [22,23,26,4] also suffer from the skewed partitions problem.…”
Section: Introduction (mentioning)
confidence: 99%
“…Many important string processing applications require suffix links [5,6,12]. For such applications, we invoke the op… Input: input string S, set of suffix sub-trees T, block size B. Output: set of suffix sub-trees T with suffix links. /* Phase 1 */ …”
Section: Suffix Link Recovery (mentioning)
confidence: 99%
“…RISO [15-17] improves SMILE in two aspects. First, instead of building the whole suffix tree for the input sequences, RISO builds a suffix tree only up to a certain level l, called a factor tree, which leads to a large space saving.…”
Section: Related Work (mentioning)
confidence: 99%
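The depth bound mentioned in the statement above can be sketched as follows. This is a simplified illustration, not RISO's implementation: it builds an uncompressed trie of nested dictionaries rather than a compacted suffix tree, and the function names `build_factor_trie`, `contains`, and `count_nodes` are hypothetical. It shows only the core idea that each suffix is inserted up to a fixed depth, so the structure indexes factors (substrings) of length at most that depth.

```python
def build_factor_trie(s: str, max_depth: int) -> dict:
    """Insert every suffix of s, truncated to max_depth characters."""
    root: dict = {}
    for i in range(len(s)):
        node = root
        for ch in s[i:i + max_depth]:
            node = node.setdefault(ch, {})
    return root


def contains(trie: dict, factor: str) -> bool:
    """Return True if factor labels a path starting at the root."""
    node = trie
    for ch in factor:
        if ch not in node:
            return False
        node = node[ch]
    return True


def count_nodes(trie: dict) -> int:
    """Count the non-root nodes, as a rough proxy for space use."""
    return sum(1 + count_nodes(child) for child in trie.values())


s = "ACGACGT"  # hypothetical input sequence
bounded = build_factor_trie(s, 3)
full = build_factor_trie(s, len(s))

print(contains(bounded, "GAC"), contains(bounded, "ACGA"))
print(count_nodes(bounded), count_nodes(full))
```

Queries longer than the depth bound cannot be answered by the bounded structure, which is the trade-off accepted in exchange for the smaller node count.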