Abstract. We prove that a document collection, represented as a single sequence T of n terms over a vocabulary Σ, can be stored in nH0(T) + o(n)(H0(T) + 1) bits of space such that a conjunctive query t1 ∧ · · · ∧ tk can be answered in O(kδ log log |Σ|) adaptive time, where δ is the instance difficulty of the query, as defined by Barbay and Kenyon in their SODA'02 paper, and H0(T) is the empirical entropy of order 0 of T. By comparison, an inverted index combined with the adaptive intersection algorithm of Barbay and Kenyon takes O(kδ log(nM/δ)) time, where nM is the length of the longest occurrence list among those of the query terms. Thus, an inverted index can be replaced by a more space-efficient in-memory encoding that outperforms the inverted index at query time when the ratio nM/δ is ω(log |Σ|).
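To make the comparison baseline concrete, here is a minimal sketch of an adaptive intersection over the sorted occurrence lists of an inverted index. It uses a round-robin candidate scheme with galloping (doubling) search, in the spirit of adaptive intersection algorithms; it is not claimed to be the exact Barbay and Kenyon algorithm, nor the compressed representation of the abstract. The names `gallop` and `intersect` are hypothetical, and the lists are assumed to be sorted and free of duplicates.

```python
from bisect import bisect_left


def gallop(lst, target, lo):
    """Return the smallest index i >= lo with lst[i] >= target (len(lst) if none)."""
    step = 1
    hi = lo
    # Double the step until we reach an element >= target or run off the list.
    while hi < len(lst) and lst[hi] < target:
        lo = hi + 1
        hi += step
        step *= 2
    # Finish with a binary search inside the bracketed range [lo, hi).
    return bisect_left(lst, target, lo, min(hi, len(lst)))


def intersect(lists):
    """Intersect k sorted, duplicate-free lists (assumed input format)."""
    k = len(lists)
    if k == 0 or any(len(lst) == 0 for lst in lists):
        return []
    pos = [0] * k              # current cursor in each list
    result = []
    i = 0                      # list that supplied or last confirmed the candidate
    candidate = lists[0][0]
    matched = 1                # number of lists known to contain the candidate
    while True:
        if matched == k:
            # Every list contains the candidate: report it and move on.
            result.append(candidate)
            pos[i] += 1
            if pos[i] == len(lists[i]):
                return result
            candidate = lists[i][pos[i]]
            matched = 1
            continue
        i = (i + 1) % k
        pos[i] = gallop(lists[i], candidate, pos[i])
        if pos[i] == len(lists[i]):
            return result      # one list is exhausted, nothing more can match
        if lists[i][pos[i]] == candidate:
            matched += 1
        else:
            # The candidate is missing from list i; its successor becomes the new candidate.
            candidate = lists[i][pos[i]]
            matched = 1


postings = [[1, 3, 4, 7, 9, 12], [3, 4, 5, 9, 20], [0, 3, 9, 10, 11]]
print(intersect(postings))  # [3, 9]
```

The cost of each galloping step grows with the logarithm of the distance skipped, which is the behaviour that makes the running time depend on how interleaved the lists are rather than on their total length.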
Let SO^plog denote the restriction of second-order logic in which second-order quantification ranges over relations of size at most poly-logarithmic in the size of the structure. In this article we investigate which Turing machine complexity class is captured by Boolean queries over ordered relational structures that can be expressed in this logic. For this we define a hierarchy of fragments Σ^plog_m (and Π^plog_m) given by formulae with alternating blocks of existential and universal second-order quantifiers in quantifier-prenex normal form. We first show that the existential fragment Σ^plog_1 captures NPolyLogTime, i.e. the class of Boolean queries that can be accepted by a non-deterministic Turing machine with random access to the input in time O((log n)^k) for some k ≥ 0. Using alternating Turing machines with random access input allows us to characterise also the fragments Σ^plog_m (and Π^plog_m) as those Boolean queries with at most m alternating blocks of second-order quantifiers that are accepted by an alternating Turing machine. Consequently, SO^plog captures the whole poly-logarithmic time hierarchy. We demonstrate the relevance of this logic and complexity class by several problems in database theory.
ACM Subject Classification: Theory of computation → Finite Model Theory
According to Immerman, the credo of descriptive complexity theory is that "the computational complexity of all problems in Computer Science can be understood via the complexity of their logical descriptions" [17, p.5]. Starting from Fagin's fundamental result [8] that the existential fragment SO∃ of second-order logic over finite relational structures captures all decision problems that are accepted by a non-deterministic Turing machine in polynomial [...] deterministic Turing machine with random access to the input in time O((log n)^k) for some k ≥ 0.
Using alternating Turing machines with random access input allows us to characterise also the fragments Σ^plog_m (and Π^plog_m) as those Boolean queries with at most m alternating blocks of second-order quantifiers that are accepted by an alternating Turing machine. That is, each fragment Σ^plog_m (and Π^plog_m) coincides with the corresponding level of the poly-logarithmic time hierarchy. Consequently, SO^plog captures the whole poly-logarithmic time hierarchy PLH.
Footnote 1: The result in [2] is actually more general, allowing any set of Boolean functions F of n^O(1) inputs complying with a padding property and containing the functions OR and AND. The restricted second-order logic is defined by extending first-order logic with a second-order quantifier Q_f for each f ∈ F, which ranges over relations on the sub-domain {1, . . . , log n}, where n is the size of the interpreting structure. The case related to our result is F = {OR, AND}, which gives rise to restricted existential and universal second-order quantifiers.
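The semantics of a restricted existential second-order quantifier can be illustrated with a small brute-force evaluator. This is only a sketch of what "∃X with |X| at most poly-logarithmic in n" means; it is not the random-access Turing machine model of the article and it runs in quasi-polynomial time, not poly-logarithmic time. The function names, the choice of bound ⌈log2 n⌉^k, and the clique property used as the inner formula are all assumptions made for illustration.

```python
from itertools import combinations, product
from math import ceil, log2


def polylog_bound(n, k=1):
    """Assumed size bound ceil(log2 n)^k for quantified relations (at least 1)."""
    return max(1, ceil(log2(n)) ** k) if n > 1 else 1


def exists_small_relation(domain, arity, bound, phi):
    """Evaluate 'exists X of the given arity with |X| <= bound such that phi(X)'.

    Exhaustive search over all relations of at most `bound` tuples.
    """
    tuples = list(product(domain, repeat=arity))
    for size in range(bound + 1):
        for candidate in combinations(tuples, size):
            if phi(frozenset(candidate)):
                return True
    return False


def has_log_clique(vertices, edges):
    """Hypothetical example query: is there a clique with ceil(log2 n) vertices?"""
    n = len(vertices)
    target = polylog_bound(n)
    adjacent = {frozenset(e) for e in edges}

    def phi(X):
        members = [v for (v,) in X]          # X is a unary relation
        return len(members) == target and all(
            frozenset((u, v)) in adjacent for u, v in combinations(members, 2)
        )

    return exists_small_relation(vertices, 1, target, phi)


vertices = list(range(8))                      # n = 8, so the bound is 3
triangle = [(0, 1), (1, 2), (0, 2), (3, 4)]
path = [(i, i + 1) for i in range(7)]
print(has_log_clique(vertices, triangle))      # True
print(has_log_clique(vertices, path))          # False
```

The search space has size n^O(bound), which shows why bounding the quantified relations by a poly-logarithmic size is a genuine restriction compared with unrestricted second-order quantification.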
We propose logical characterizations of problems solvable in deterministic polylogarithmic time (PolylogTime) and polylogarithmic space (PolylogSpace). We introduce a novel two-sorted logic that separates the elements of the input domain from the bit positions needed to address these elements. We prove that the inflationary and partial fixed point variants of this logic capture PolylogTime and PolylogSpace, respectively. In the course of proving that our logic indeed captures PolylogTime on finite ordered structures, we introduce a variant of random-access Turing machines that can access the relations and functions of a structure directly. We investigate whether an explicit predicate for the ordering of the domain is needed in our PolylogTime logic. Finally, we present the open problem of finding an exact characterization of order-invariant queries in PolylogTime.
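The distinction between domain elements and the bit positions that address them can be made tangible with a toy random-access model. The sketch below is an assumption-laden illustration, not the machine defined by the authors: the class `RandomAccessInput`, its step accounting (one step per address bit written plus one per read), and the membership query on a sorted input are all hypothetical.

```python
from math import ceil, log2


class RandomAccessInput:
    """Toy input tape read through a binary address register (hypothetical model)."""

    def __init__(self, cells):
        self._cells = list(cells)
        n = len(self._cells)
        # Number of address bits needed to index n cells.
        self.address_bits = max(1, ceil(log2(n))) if n > 0 else 1
        self.steps = 0

    def read(self, bits):
        """Read the cell addressed by `bits` (most significant bit first)."""
        self.steps += len(bits) + 1          # write the address, then read
        index = int("".join(str(b) for b in bits), 2)
        return self._cells[index] if index < len(self._cells) else None


def to_bits(index, width):
    """Binary representation of `index` as a list of `width` bits."""
    return [(index >> shift) & 1 for shift in range(width - 1, -1, -1)]


def contains(inp, n, key):
    """Binary search on a sorted input using only addressed reads."""
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        value = inp.read(to_bits(mid, inp.address_bits))
        if value == key:
            return True
        if value < key:
            lo = mid + 1
        else:
            hi = mid
    return False


cells = list(range(0, 2048, 2))              # 1024 sorted even numbers
inp = RandomAccessInput(cells)
print(contains(inp, len(cells), 1000))       # True
print(inp.steps, "steps for", len(cells), "cells")
```

With n = 1024 the search performs at most 11 probes of 11 steps each, so the total work is O((log n)^2) in this accounting, far below the n steps a sequential scan would need.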