Yury Lifshits scite author profile

Yury Lifshits

5 Publications

259 Citation Statements Received

104 Citation Statements Given


Affiliations

Yahoo (United States), St. Petersburg Department of Steklov Institute of Mathematics, California Institute of Technology

Publications


Processing Compressed Texts: A Tractability Border

Lifshits

121


Abstract. What kind of operations can we perform effectively (without full unpacking) with compressed texts? In this paper we consider three fundamental problems: (1) check the equality of two compressed texts, (2) check whether one compressed text is a substring of another compressed text, and (3) compute the number of different symbols (Hamming distance) between two compressed texts of the same length. We present an algorithm that solves the first problem in O(n^3) time and the second problem in O(n^2 m) time. Here n is the size of the compressed representation of the text (we consider representations by straight-line programs) and m is the size of the compressed representation of the pattern. Next, we prove that the third problem is actually #P-complete. Thus, we indicate a pair of similar problems (equivalence checking, Hamming distance computation) that have radically different complexity on compressed texts. The algorithmic technique used for problems (1) and (2) also helps in computing minimal periods and covers of compressed texts.


Combinatorial Algorithms for Nearest Neighbors, Near-Duplicates and Small-World Design

Lifshits

Zhang

2009


We study the so-called combinatorial framework for algorithmic problems in similarity spaces. Namely, the input dataset is represented by a comparison oracle that, given three points x, y, y', answers whether y or y' is closer to x. We assume that the similarity order of the dataset satisfies the four variations of the following disorder inequality: if x is the a'th most similar object to y and y is the b'th most similar object to z, then x is among the D(a + b) most similar objects to z, where D is a relatively small disorder constant. Though the oracle gives much less information than the standard general metric space model, where distance values are given, one can still design very efficient algorithms for various fundamental computational tasks. For nearest neighbor search we present a deterministic and exact algorithm with almost linear time and space complexity of preprocessing, and near-logarithmic time complexity of search. Then, for near-duplicate detection we present the first known deterministic algorithm that requires just near-linear time plus time proportional to the size of the output. Finally, we show that for any dataset satisfying the disorder inequality a visibility graph can be constructed: all outdegrees are near-logarithmic and greedy routing deterministically converges to the nearest neighbor of a target in a logarithmic number of steps. The latter result is the first known work-around for Navarro's impossibility of generalizing Delaunay graphs. The technical contribution of the paper consists of handling "false positives" in data structures and an algorithmic technique called up-aside-down-filter.
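The comparison oracle and the disorder inequality can be illustrated with a small brute-force sketch. Everything here is an assumption for illustration: the function names, the convention that ranks start at 1 and exclude the reference point, and the absence of ties. It checks only the single inequality quoted above, not all four variations, and it is not one of the paper's algorithms.

```python
from itertools import permutations


def make_oracle(dist):
    """Wrap a hidden distance function as a comparison oracle.

    oracle(x, y1, y2) is True when y1 is strictly closer to x than y2.
    Callers never see the distance values themselves.
    """
    def oracle(x, y1, y2):
        return dist(x, y1) < dist(x, y2)
    return oracle


def rank(oracle, points, ref, target):
    """Position of `target` when points are ordered by similarity to `ref`.

    1 means most similar. Uses only oracle calls and assumes no ties.
    """
    closer = sum(
        1 for p in points
        if p != ref and p != target and oracle(ref, p, target)
    )
    return 1 + closer


def disorder_constant(oracle, points):
    """Smallest D with rank_z(x) <= D * (rank_y(x) + rank_z(y)) over all
    triples of distinct points x, y, z (brute force over every triple)."""
    worst = 0.0
    for x, y, z in permutations(points, 3):
        a = rank(oracle, points, y, x)  # x is the a'th most similar to y
        b = rank(oracle, points, z, y)  # y is the b'th most similar to z
        worst = max(worst, rank(oracle, points, z, x) / (a + b))
    return worst
```

For example, with `points = [0, 1, 3, 7]` and `make_oracle(lambda p, q: abs(p - q))`, the call `disorder_constant(oracle, points)` returns the smallest D for which that four-point set satisfies the inequality.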


Window Subsequence Problems for Compressed Texts

Cegielski

Guessarian

Lifshits

et al. 2006


Querying and Embedding Compressed Texts

Lifshits

Lohrey

2006


Abstract. In this work the computational complexity of two simple string problems on compressed input strings is considered: the querying problem (What is the symbol at a given position in a given input string?) and the embedding problem (Can the first input string be embedded into the second input string?). Straight-line programs are used for text compression. It is shown that the querying problem becomes P-complete for compressed strings, while the embedding problem becomes hard for the complexity class Θ_2^p.
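As an illustration of the querying problem, the sketch below finds the symbol at a given position by walking down the grammar and using subtree lengths to choose a branch, so the text is never unpacked. This is the standard top-down descent, shown under assumptions: the dictionary encoding of the straight-line program and the name `symbol_at` are hypothetical, and the code is not taken from the paper.

```python
def symbol_at(slp, start, index):
    """Return the symbol at 0-based `index` of the text derived from `start`.

    `slp` maps a rule name to a single character (terminal) or to a
    (left, right) pair of rule names (concatenation).
    """
    lengths = {}

    def length(name):
        # Memoized length of the text derived from `name`.
        if name not in lengths:
            rule = slp[name]
            if isinstance(rule, str):
                lengths[name] = 1
            else:
                lengths[name] = length(rule[0]) + length(rule[1])
        return lengths[name]

    if not 0 <= index < length(start):
        raise IndexError("position outside the derived text")

    name = start
    while True:
        rule = slp[name]
        if isinstance(rule, str):
            return rule
        left, right = rule
        left_len = length(left)
        if index < left_len:
            name = left            # position falls in the left part
        else:
            index -= left_len      # shift into the right part
            name = right


# A 63-rule grammar deriving "ab" repeated 2**60 times.
DOUBLING_SLP = {"A": "a", "B": "b", "X0": ("A", "B")}
for i in range(1, 61):
    DOUBLING_SLP[f"X{i}"] = (f"X{i-1}", f"X{i-1}")
```

The derived text of `X60` has 2**61 symbols, far too long to unpack, yet each query follows one root-to-leaf path through the grammar.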


Optimal Parameters for Locality-Sensitive Hashing

Slaney

Lifshits

2012

Proc. IEEE

