1987
DOI: 10.1145/28869.28873

Complete inverted files for efficient text retrieval and analysis

Abstract: Given a finite set of texts S = {w1, ..., wk} over some fixed finite alphabet Σ, a complete inverted file for S is an abstract data type that provides the functions find(w), which returns the longest prefix of w that occurs (as a subword of a word) in S; freq(w), which returns the number of times w occurs in S; and locations(w), which returns the set of positions where w occurs in S. A data structure that implements a complete inverted file for S that occupies linear space and can be built in linear ti…
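The three operations named in the abstract can be illustrated with a deliberately naive sketch. This is not the linear-space structure the paper describes: it stores every subword explicitly, so it needs quadratic space per text. The class name `NaiveInvertedFile` and the choice to report a position as a `(text index, start offset)` pair are assumptions made for illustration only.

```python
from collections import defaultdict


class NaiveInvertedFile:
    """Brute-force stand-in for a complete inverted file (hypothetical name).

    Assumption: a "position" is a (text index, start offset) pair.
    Space is quadratic in the text length, unlike the paper's structure.
    """

    def __init__(self, texts):
        self.texts = list(texts)
        # Map every non-empty subword to the positions where it starts.
        self.index = defaultdict(list)
        for t, text in enumerate(self.texts):
            for i in range(len(text)):
                for j in range(i + 1, len(text) + 1):
                    self.index[text[i:j]].append((t, i))

    def find(self, w):
        """Return the longest prefix of w that occurs as a subword in S."""
        for end in range(len(w), 0, -1):
            if w[:end] in self.index:
                return w[:end]
        return ""

    def freq(self, w):
        """Return the number of times w occurs in S."""
        return len(self.index.get(w, ()))

    def locations(self, w):
        """Return the set of positions where w occurs in S."""
        return set(self.index.get(w, ()))
```

For example, with `S = ["abcbc", "bca"]`, the subword `bc` occurs twice in the first text and once in the second, so `freq("bc")` is 3 and `locations("bc")` holds three pairs.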

Cited by 202 publications (163 citation statements)
References: 12 publications
“…In this subsection, we recall the equivalence relations introduced by Blumer et al. [10,1], and then state their properties. Throughout this paper, we consider the equivalence classes of the input string w that ends with a distinct symbol $ that does not appear anywhere else in w. For any string x ∈ Substr(w), let,…”
Section: Equivalence Relations On Strings
Citation type: mentioning
confidence: 99%
“…However, a given text contains too many substrings to browse or analyze. A reasonable approach is to partition the set of substrings into equivalence classes under the equivalence relation of [1] so that an expert can examine the classes one by one [3]. This equivalence relation groups together substrings that correspond to essentially identical occurrences in the text.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Previous data structures for this problem include the suffix tree [1], the compact directed acyclic word graph (compact DAWG) [2], and the suffix array [3]. The first two approaches take O(n) time to build the data structure, and O(m + k) time to find the k positions where the pattern string occurs.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
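Of the three structures listed in the statement above, the suffix array is the simplest to sketch. The version below is a minimal illustration under stated assumptions, not the construction from any cited paper: it builds the array by plain sorting rather than in O(n) time, and it locates a pattern of length m by binary search, which costs O(m log n + k) rather than the O(m + k) quoted for the suffix tree and compact DAWG. The function names `build_suffix_array` and `find_occurrences` are hypothetical.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions by suffix content."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions of pattern in text.

    Suffixes are sorted, so their length-m prefixes are non-decreasing
    and the matching suffixes form one contiguous block of sa.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])
```

A call such as `find_occurrences("banana", build_suffix_array("banana"), "ana")` reports the k matching positions in one slice of the array, which is where the `+ k` term in the query bounds comes from.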