2015
DOI: 10.14778/2856318.2856329

Explaining query answers with explanation-ready databases

Abstract: With the increased generation and availability of big data in different domains, there is an imminent requirement for data analysis tools that are able to 'explain' the trends and anomalies obtained from this data to a range of users with different backgrounds. Wu-Madden (PVLDB 2013) and Roy-Suciu (SIGMOD 2014) recently proposed solutions that can explain interesting or unexpected answers to simple aggregate queries in terms of predicates on attributes. In this paper, we propose a generic framework that can su…

Cited by 62 publications (41 citation statements)
References: 25 publications
“…Their origin stems from data provenance [12], but have since departed from such notions, and focus on explanations in varying contexts. One of such contexts is to provide explanations, represented as predicates, for simple query answers as in [15], [27], [31]. Explanations have been also used for interpreting outliers in both in-situ data [38] and in streaming data [7].…”
Section: Related Work and Contribution (mentioning)
confidence: 99%
“…Scalability/efficiency is particularly important: Computing explanations is proved to be an NP-Hard problem [35] and generating them can take a long time [15], [31], [38] even with modest datasets. An exponential increase in data size implies a dramatic increase in the time to generate explanations.…”
Section: Related Work and Contribution (mentioning)
confidence: 99%
“…In the backward pass, we aim to find an explanation for a detected anomaly. There are two parallel efforts on explanation discovery: In the database community, recent work [13,15] aims to discover explanations of query results, but is limited to results of group-by aggregate queries. In the machine learning community, sensitivity tests [11,12,14] are designed to determine the importance of each input feature.…”
Section: Backward Pass: Explanation Discovery (mentioning)
confidence: 99%
“…In this position paper, we argue for the need of a new type of data stream analytics that can address anomaly detection and explanation discovery in a single, integrated system. In the literature these two topics have been addressed in isolation: while anomaly detection has been studied intensively in the data mining community [3,5,6], explanation discovery with the goal of human-readable formulas has recently received attention in the database community [13,15,18]. On the latter topic, the line of work [13,15] explains outliers for only group-by aggregate queries and finds a logical formula to describe a subset of tuples that contribute the most to the excessively high or low aggregate value of a specific group.…”
Section: Introduction (mentioning)
confidence: 99%
“…The related works in [104,105,129] consider the problem of finding explanations for outliers in their query answers. An example of these outliers can be an aggregate value that is abnormally different from the rest of the values in an answer produced by a group-by SQL query.…”
Section: Explain Outliers In Query Answer Problem (mentioning)
confidence: 99%
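
Several of the statements above describe explanations as predicates that pick out the tuples contributing most to an unusually high or low aggregate in one group of a group-by query. A minimal sketch of that idea follows. It is a toy illustration, not the algorithm of the cited paper or of any citing work: the table `ROWS`, its column names, and the helpers `group_sum` and `rank_explanations` are all hypothetical, and the score used (how much the group's SUM drops when the tuples matching a predicate are removed) is just one simple way such a contribution could be measured.

```python
from collections import defaultdict

# Hypothetical toy table; all names and values are invented for illustration.
ROWS = [
    {"region": "east", "product": "a", "channel": "web", "sales": 10},
    {"region": "east", "product": "a", "channel": "store", "sales": 12},
    {"region": "east", "product": "b", "channel": "web", "sales": 95},
    {"region": "east", "product": "b", "channel": "store", "sales": 11},
    {"region": "west", "product": "a", "channel": "web", "sales": 9},
    {"region": "west", "product": "b", "channel": "web", "sales": 13},
    {"region": "west", "product": "a", "channel": "store", "sales": 11},
    {"region": "west", "product": "b", "channel": "store", "sales": 10},
]


def group_sum(rows, group_col, value_col):
    """Equivalent of SELECT group_col, SUM(value_col) ... GROUP BY group_col."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_col]] += row[value_col]
    return dict(totals)


def rank_explanations(rows, group_col, group, value_col, attrs):
    """Rank equality predicates (attr = value) for one group.

    The score of a predicate is the amount by which the group's SUM drops
    when every tuple satisfying the predicate is removed from the group.
    """
    in_group = [row for row in rows if row[group_col] == group]
    base = sum(row[value_col] for row in in_group)
    # Candidate predicates: every (attribute, value) pair seen in the group.
    candidates = {(attr, row[attr]) for row in in_group for attr in attrs}
    scored = []
    for attr, value in candidates:
        remaining = sum(row[value_col] for row in in_group if row[attr] != value)
        scored.append(((attr, value), base - remaining))
    # Largest drop first; ties broken by predicate name for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


if __name__ == "__main__":
    print(group_sum(ROWS, "region", "sales"))
    for predicate, drop in rank_explanations(
        ROWS, "region", "east", "sales", ["product", "channel"]
    ):
        print(predicate, drop)
```

In this toy table the `east` total is far higher than the `west` total, and the top-ranked predicate is `product = b`, since removing those tuples lowers the `east` sum the most. The sketch enumerates only single-attribute equality predicates; conjunctions of predicates would enlarge the candidate space considerably, which is in line with the scalability concerns raised in the statements above.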