2014
DOI: 10.1109/tkde.2013.129

Large-Scale Pattern Search Using Reduced-Space On-Disk Suffix Arrays

Abstract: The suffix array is an efficient data structure for in-memory pattern search. Suffix arrays can also be used for external-memory pattern search, via two-level structures that use an internal index to identify the correct block of suffix pointers. In this paper we describe a new two-level suffix array-based index structure that requires significantly less disk space than previous approaches. Key to the saving is the use of disk blocks that are based on prefixes rather than the more usual uniform-sampli…
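The two-level idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's prefix-based structure; it is a generic fixed-size-block layout, assumed here only for illustration. The function names (`build_suffix_array`, `build_blocks`, `search`) and the `block_size` parameter are hypothetical, the "disk blocks" are plain Python lists, and the text is kept in memory for simplicity.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    # Naive construction by sorting suffixes; adequate for a small illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_blocks(suffix_array, block_size):
    # Split the suffix pointers into fixed-size blocks (stand-ins for disk blocks).
    return [suffix_array[i:i + block_size]
            for i in range(0, len(suffix_array), block_size)]


def search(text, blocks, pattern):
    m = len(pattern)
    # Internal index: the first suffix of each block, truncated to m characters.
    # (Assumption: a real index would store its keys once, not rebuild them per query.)
    heads = [text[block[0]:block[0] + m] for block in blocks]
    # Matches can begin in the block just before the first head >= pattern
    # and cannot extend past the last head equal to the pattern.
    lo = max(bisect_left(heads, pattern) - 1, 0)
    hi = bisect_right(heads, pattern)
    hits = []
    for block in blocks[lo:hi]:  # only these blocks would be read from disk
        hits.extend(p for p in block if text.startswith(pattern, p))
    return sorted(hits)
```

For example, `search(text, build_blocks(build_suffix_array(text), 4), "ssi")` inspects only the blocks whose key range can contain the pattern, which is the role the abstract assigns to the internal index.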

Cited by 15 publications (11 citation statements)
References 32 publications

“…At the end of the MLERP process the following occurrences have been discovered with the related positions: i (1,4,7,10), p (8,9), s (2,3,5,6), si (3,6), ssi (2,5) and issi (1,4). The results are exactly the same as in the example presented in [8] with the same time series where the whole suffix array was processed.…”
Section: Algorithms' Analysis (mentioning)
confidence: 66%
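The positions listed in the statement above can be checked with a short sketch. The excerpt does not name the time series; the reported positions happen to be consistent with the string `mississippi` under 0-based indexing, so that string is assumed here purely for illustration. The helper `occurrences` is hypothetical and uses a plain scan, not the MLERP process or a suffix array.

```python
def occurrences(text, pattern):
    """Return every 0-based start position of pattern in text, overlaps included."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


# Assumption: the series is "mississippi"; the excerpt itself does not say so.
series = "mississippi"

# Patterns and positions exactly as quoted in the citation statement.
reported = {
    "i": [1, 4, 7, 10],
    "p": [8, 9],
    "s": [2, 3, 5, 6],
    "si": [3, 6],
    "ssi": [2, 5],
    "issi": [1, 4],
}

# True for each pattern whose scanned positions equal the reported ones.
matches = {pattern: occurrences(series, pattern) == positions
           for pattern, positions in reported.items()}
```

Under this assumption every entry of `matches` is `True`, which shows what "occurrences with the related positions" means: each pattern is paired with the start offsets at which it appears in the series.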
“…Yet, although many algorithms exist that take advantage of the previously mentioned data structures, there are no algorithms for the detection of all repeated patterns. Of course, any of the already existing algorithms, e.g., [2][3][4][5][6] and [7] can be used for the detection of all repeated patterns, yet, we will show that this is unfeasible for very long time series and even for patterns with very small length inside any kind of time series. Another feature of our methodology is that although the COV algorithm can be directly executed in memory, we prefer to store the suffix array data structure on an external database management system.…”
Section: Introduction (mentioning)
confidence: 90%