Suffix Arrays: A New Method for On-Line String Searches

Manber, Udi; Myers, Gene

doi:10.1137/0222058

Cited by 1,619 publications

(1,230 citation statements)

References 24 publications

Supporting

Mentioning

1,206

Contrasting

Unclassified

“…With sophisticated techniques for proving upper and lower bounds on the complexity of searching V in lexicographic order, Andersson, Hagerup, Håstad and Petersson have proved in [1] that it requires Θ(k log log n / log log(4 + (k log log n) / log n) + k + log n) time. This bound is worse than Θ(k + log n), obtained by searching V plus O(n) auxiliary locations (e.g., Manber and Myers [18]). Using permutations other than those resulting from sorting is a way to reach optimality: Franceschini and Grossi [10] have shown that for any set V of n vectors in lexicographic order, there exists a permutation of them allowing for Θ(k + log n) search time using O(1) auxiliary data locations.…”

Section: Introduction (mentioning)

confidence: 87%

Optimal In-place Sorting of Vectors and Records

Franceschini

Grossi

2005

Automata, Languages and Programming

“…The pivots inside M B are kept searchable by a suitable blend of the techniques in [10,13,18], requiring to decode O(log n) heavy bits per inserted vector (which is fine since decoding takes O(1 + k / log n) time). In particular, we logically divide each vector x into a concatenation of O(log m) = O(log n) equally sized chunks.…”

Section: High-level Description (mentioning)

confidence: 99%

Optimal In-place Sorting of Vectors and Records

Franceschini

Grossi

2005

Automata, Languages and Programming

“…A suffix array allows us to rapidly find a file (or files), containing any given substring. This is achieved with a binary search, and requires O(m + log_2 n) time on average, where m is the length of the substring (it is also possible to make this the worst case complexity, see [4]). The array can be constructed in time O(n log n), assuming atomic comparison of two tokens.…”

Section: Algorithm 1 Compare a File Against An Existing Collection (mentioning)

confidence: 99%

“…We use suffix array as an index structure. A suffix array is a lexicographically sorted array of all suffixes of a given string [4]. The suffix array for the whole document collection is of size O(n).…”

Section: Algorithm 1 Compare a File Against An Existing Collection (mentioning)

confidence: 99%
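
The two statements above describe a suffix array as the lexicographically sorted array of all suffixes of a string, searched by binary search. A minimal Python sketch of that idea follows. The function names `build_suffix_array` and `find_occurrences` are hypothetical and chosen only for illustration. The construction simply sorts the suffixes directly, which is much slower than the O(n log n) construction mentioned in the quote, and the search is a plain binary search costing O(m log n) character comparisons in the worst case; it does not use the auxiliary longest-common-prefix information that Manber and Myers employ to reach O(m + log n).

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive illustration only: sorting the suffixes directly is far slower
    than the O(n log n) construction referred to in the quoted statement.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return every position where `pattern` occurs in `text`, in ascending order.

    Plain binary search over the suffix array (no LCP acceleration).
    The length-m prefixes of the sorted suffixes are themselves in
    non-decreasing order, so two binary searches delimit the matches.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)
    print(find_occurrences(text, sa, "ana"))
```

For a document collection, as in the plagiarism-detection setting quoted above, one possible approach is to concatenate the documents with separator tokens and map each reported position back to its file; that mapping is omitted here.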

Fast Plagiarism Detection System

Mozgovoy

Fredriksson

White

et al. 2005

String Processing and Information Retrieval

Reducing the space requirement of suffix trees

Kurtz

1999

Softw: Pract. Exper.

We show that suffix trees store various kinds of redundant information. We exploit these redundancies to obtain more space efficient representations. The most space efficient of our representations requires 20 bytes per input character in the worst case, and 10.1 bytes per input character on average for a collection of 42 files of different type. This is an advantage of more than 8 bytes per input character over previous work. Our representations can be constructed without extra space, and as fast as previous representations. The asymptotic running times of suffix tree applications are retained.

Suffix Arrays: A New Method for On-Line String Searches

Abstract: A new and conceptually simple data structure, called a suffix array, for on-line string searches is intro
