Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems 2017
DOI: 10.1145/3034786.3056105

What Do Shannon-type Inequalities, Submodular Width, and Disjunctive Datalog Have to Do with One Another?

Abstract: Recent works on bounding the output size of a conjunctive query with functional dependencies and degree constraints have shown a deep connection between fundamental questions in information theory and database theory. We prove analogous output bounds for disjunctive datalog rules, and answer several open questions regarding the tightness and looseness of these bounds along the way. Our bounds are intimately related to Shannon-type information inequalities. We devise the notion of a "proof sequence" of a specif…

Cited by 72 publications (85 citation statements)
References 54 publications
“…In the RAM model, output-sensitive join algorithms have been extensively studied. The running time of most algorithms is in form of O(IN^w + OUT), where w is certain notion of width of the hypergraph Q [15,17,27,23]. However, it is not clear if this is optimal.…”
Section: Other Related Results (mentioning)
confidence: 99%
“…If a feature extraction query has width w, then its data complexity is Õ(N^w) for a database of size N, where Õ hides logarithmic factors in N. Various width measures have been proposed recently, such as: the fractional edge cover number [20,8,37,38,55] to capture the asymptotic size of the results for join queries and the time to compute them; the fractional hypertree width [32] and the submodular width [7] to capture the time to compute Boolean conjunctive queries; the factorization width [42] to capture the size of the factorized results of conjunctive queries; the FAQ-width [6] that extends the factorization width from conjunctive queries to functional aggregate queries; and the sharp-submodular width [2] that improves on the previous widths for functional aggregate queries.…”
Section: Structure-aware Learning (mentioning)
confidence: 99%
“…Our implementation at LogicBlox makes use of generalizations of AGM to queries with functional dependencies and immaterialized predicates (such as a + b = c). These new bounds are based on a linear program whose variables are marginal entropies [4,5].…”
Section: Practical Implications (mentioning)
confidence: 99%
“…The second problem is to select a good variable ordering to run InsideOut on. In principle, one does not have to use the AGM-bound or the bounds from [4,5] to estimate the cost of an FAQ subquery. If one were to implement InsideOut inside any RDBMS, one could poll that RDBMS's optimizer to figure out the cost of a given variable ordering.…”
Section: Practical Implications (mentioning)
confidence: 99%