Amihood Amir scite author profile

The current explosion of stored information necessitates a new model of pattern matching, that of compressed matching. In this model one tries to find all occurrences of a pattern in a compressed text in time proportional to the compressed text size, i.e., without decompressing the text. The most effective general-purpose compression algorithms are adaptive, in that the text represented by each compression symbol is determined dynamically by the data. As a result, the encoding of a substring depends on its location. Thus the same substring may "look different" every time it appears in the compressed text. In this paper we consider pattern matching without decompression in the UNIX Z-compression. This is a variant of the Lempel-Ziv adaptive compression scheme. If n is the length of the compressed text and m is the length of the pattern, our algorithms find the first pattern occurrence in time O(n + m^2) or O(n log m + m). We also introduce a new criterion to measure compressed matching algorithms, that of extra space. We show how to modify our algorithms to achieve a trade-off between the amount of extra space used and the algorithm's time complexity.
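The claim that the same substring may "look different" at each occurrence can be illustrated with a minimal LZW sketch. This is a textbook simplification, not the paper's matching algorithm and not the exact UNIX compress format (which also uses variable-width codes and dictionary resets); the function names `lzw_compress` and `lzw_decompress` are hypothetical.

```python
def lzw_compress(text, alphabet):
    """Textbook LZW: return a list of integer codes for `text`.

    Assumes every character of `text` occurs in `alphabet`.
    """
    # The dictionary starts with one code per alphabet symbol.
    table = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in table:
            current = candidate          # keep extending the current phrase
        else:
            codes.append(table[current])  # emit the longest known phrase
            table[candidate] = len(table)  # learn a new phrase adaptively
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes, alphabet):
    """Inverse of lzw_compress, rebuilding the dictionary on the fly."""
    table = {i: ch for i, ch in enumerate(alphabet)}
    if not codes:
        return ""
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # Code refers to the phrase being defined right now.
            entry = prev + prev[0]
        out.append(entry)
        table[len(table)] = prev + entry[0]
        prev = entry
    return "".join(out)


codes = lzw_compress("ababab", "ab")
print(codes)  # [0, 1, 2, 2]
```

In the output, the first "ab" is encoded as the two codes 0, 1, while each later "ab" is the single code 2, so a pattern cannot be located by searching for one fixed code sequence.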

Alphabet dependence in parameterized matching

Amir

Farach

Muthukrishnan

1994

Information Processing Letters

102

Text Indexing and Dictionary Matching with One Error

Amir

Keselman

Landau

et al. 2000

Journal of Algorithms

On Hardness of Jumbled Indexing

Amir

Chan

Lewenstein

et al. 2014

Abstract. Jumbled indexing is the problem of indexing a text T for queries that ask whether there is a substring of T matching a pattern represented as a Parikh vector, i.e., the vector of frequency counts for each character. Jumbled indexing has garnered a lot of interest in the last four years; for a partial list see [2,6,13,16,17,20,22,24,26,30,35,36]. There is a naive algorithm that preprocesses all answers in O(n^2 |Σ|) time allowing quick queries afterwards, and there is another naive algorithm that requires no preprocessing but has O(n log |Σ|) query time. Despite a tremendous amount of effort there has been little improvement over these running times. In this paper we provide good reason for this. We show that, under a 3SUM-hardness assumption, jumbled indexing for alphabets of size ω(1) requires Ω(n^(2−ε)) preprocessing time or Ω(n^(1−δ)) query time for any ε, δ > 0. In fact, under a stronger 3SUM-hardness assumption, for any constant alphabet size r ≥ 3 there exist describable fixed constants ε_r and δ_r such that jumbled indexing requires Ω(n^(2−ε_r)) preprocessing time or Ω(n^(1−δ_r)) query time.
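A jumbled query with no preprocessing can be sketched as a sliding window over the text that maintains character counts. This is only an illustration of the problem statement, assuming the Parikh vector is given as a dictionary of non-negative counts; the function name `jumbled_match` is hypothetical, and this simple version compares whole count tables at each step, so it is not tuned to the O(n log |Σ|) bound quoted in the abstract.

```python
from collections import Counter


def jumbled_match(text, parikh):
    """Return True if some substring of `text` has exactly the character
    counts given by `parikh` (a dict mapping characters to counts)."""
    # Drop zero entries so that equality means "same multiset of characters".
    target = Counter({c: k for c, k in parikh.items() if k > 0})
    m = sum(target.values())          # every match has length m
    if m == 0:
        return True                   # the empty substring matches
    if m > len(text):
        return False

    window = Counter(text[:m])        # counts for the first window
    if window == target:
        return True
    for i in range(m, len(text)):
        window[text[i]] += 1          # character entering the window
        leaving = text[i - m]
        window[leaving] -= 1          # character leaving the window
        if window[leaving] == 0:
            del window[leaving]       # keep the table free of zero counts
        if window == target:
            return True
    return False


print(jumbled_match("abcab", {"a": 1, "b": 1, "c": 1}))  # True
print(jumbled_match("aacc", {"a": 1, "b": 1}))           # False
```

The hardness result in the abstract concerns how far this kind of per-query scan, or the quadratic precomputation of all answers, can be improved.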

Amihood Amir

Maximum Agreement Subtree in a Set of Evolutionary Trees: Metrics and Efficient Algorithms

Let Sleeping Files Lie: Pattern Matching in Z-Compressed Files

Alphabet dependence in parameterized matching

Text Indexing and Dictionary Matching with One Error

On Hardness of Jumbled Indexing
