Abstract. We study the enumeration complexity of the natural extension of acyclic conjunctive queries with disequalities. In this language, a number of NP-complete problems can be expressed. We first improve a previous result of Papadimitriou and Yannakakis by proving that such queries can be computed in time c.|M|.|ϕ(M)| where M is the structure, ϕ(M) is the result set of the query and c is a simple exponential in the size of the formula ϕ. A consequence of our method is that, in the general case, tuples of such queries can be enumerated with a linear delay between two tuples. We then introduce a large subclass of acyclic formulas called CCQ = and prove that the tuples of a CCQ = query can be enumerated with a linear time precomputation and a constant delay between consecutive solutions. Moreover, under the hypothesis that the multiplication of two n × n boolean matrices cannot be done in time O(n 2 ), this leads to the following dichotomy for acyclic queries: either such a query is in CCQ = or it cannot be enumerated with linear precomputation and constant delay. Furthermore we prove that testing whether an acyclic formula is in CCQ = can be performed in polynomial time. Finally, the notion of free-connex treewidth of a structure is defined. We show that for each query of free-connex treewidth bounded by some constant k, enumeration of results can be done with O(|M| k+1 ) precomputation steps and constant delay.
Massive graph data sets are pervasive in contemporary application domains.
Hence, graph database systems are becoming increasingly important. In the
experimental study of these systems, it is vital that the research community
has shared solutions for the generation of database instances and query
workloads having predictable and controllable properties. In this paper, we
present the design and engineering principles of gMark, a domain- and query
language-independent graph instance and query workload generator. A core
contribution of gMark is its ability to target and control the diversity of
properties of both the generated instances and the generated workloads coupled
to these instances. Further novelties include support for regular path queries,
a fundamental graph query paradigm, and schema-driven selectivity estimation of
queries, a key feature in controlling workload chokepoints. We illustrate the
flexibility and practical usability of gMark by showcasing the framework's
capabilities in generating high quality graphs and workloads, and its ability
to encode user-defined schemas across a variety of application domains.Comment: Accepted in November 2016. URL:
http://ieeexplore.ieee.org/document/7762945/. in IEEE Transactions on
Knowledge and Data Engineering 201
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.