We propose a compressed self-index based on edit-sensitive parsing (ESP). Given a string $S$, its ESP tree is equivalent to a context-free grammar deriving just $S$, which can be represented as a DAG $G$. Finding a pattern $P$ in $S$ is reduced to embedding $P$ into $G$. Using succinct data structures, $G$ is decomposed into two LOUDS bit strings and a single array for a permutation, requiring $(1 + \varepsilon) n \log n + 4n + o(n)$ bits for any $0 < \varepsilon < 1$, where $n$ is the number of different symbols in the grammar. The time to count the occurrences of $P$ in $S$ is $O\left(\frac{\log^* u}{\varepsilon}\,(m \log n + occ_c(\log m \log u))\right)$, where $m = |P|$, $u = |S|$, and $occ_c$ is the number of occurrences of a maximal common subtree in the ESP trees of $P$ and $S$. Using an additional array of $n \log u$ bits, our index supports locating $P$ and displaying substrings of $S$. Locating time is the same as counting time, and displaying time for a substring of length $m$ is $O(m + \log u)$.

He moved to Hitachi Solutions, Ltd. This work was partially supported by the JST PRESTO program.

Lemma 1. (Cormode and Muthukrishnan [3]) The height of $ET(S)$ is $O(\log |S|)$, and $ET(S)$ can be computed in $O(|S| \log^* |S|)$ time.
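To make the correspondence between a parse tree, a grammar deriving just $S$, and a DAG more concrete, the following is a minimal sketch in Python. It is not ESP itself: actual ESP chooses its block boundaries by alphabet reduction, whereas this sketch simply pairs adjacent symbols from left to right at every level. The function names `build_grammar` and `expand` are hypothetical and introduced only for illustration. The sketch shows only that identical pairs share one nonterminal, so the tree collapses into a DAG, and that the number of levels is logarithmic in $|S|$.

```python
def build_grammar(s):
    """Build a grammar deriving exactly s by repeated left-to-right pairing.

    Assumption: this is a simplified stand-in for ESP, not the real parsing.
    Terminals are characters (str); nonterminals are integers.
    Returns (rules, root), where rules[x] = (left, right).
    """
    if not s:
        raise ValueError("input string must be non-empty")
    rules = {}   # nonterminal -> (left child, right child)
    names = {}   # (left, right) -> nonterminal; equal pairs share one DAG node
    level = list(s)
    while len(level) > 1:
        nxt = []
        i = 0
        while i < len(level):
            if i + 1 < len(level):
                pair = (level[i], level[i + 1])
                if pair not in names:
                    names[pair] = len(rules)   # fresh nonterminal id
                    rules[names[pair]] = pair
                nxt.append(names[pair])
                i += 2
            else:
                nxt.append(level[i])           # odd symbol is carried up unchanged
                i += 1
        level = nxt                            # each pass roughly halves the length
    return rules, level[0]


def expand(symbol, rules):
    """Derive the string generated by a symbol of the grammar."""
    if isinstance(symbol, str):                # terminal
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


rules, root = build_grammar("abababab")
print(len(rules), expand(root, rules))
```

In this sketch, `len(rules)` counts the distinct nonterminals only; the quantity $n$ in the space bound also includes the terminal symbols. Highly repetitive inputs yield few rules because repeated pairs reuse existing DAG nodes.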