Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, 2014
DOI: 10.1145/2588555.2588574

Mining statistically significant connected subgraphs in vertex labeled graphs

Citations: cited by 14 publications (8 citation statements)
References: 30 publications
“…This is not a trivial question as the search space in graph mining is often exponentially larger than that in itemset mining due to combinations of vertices and edges. In this paper, we give a positive answer to this question by (1) extending the approach by Terada et al [32] to solve the important open problem of significant subgraph mining with multiple testing correction via frequent subgraph mining [7,15,23,40], (2) proposing efficient search strategies for detecting testable subgraphs, one of which is empirically orders of magnitude faster than their method, and (3) further improving over naïve Bonferroni correction by considering the dependence between subgraph occurrences [22,24].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, when the data is noisy and considered to be a random sample from the population of interest, it is desired to provide statistical significance measures such as p-values or confidence intervals for each of the discovered patterns. Although several researchers in data mining community studied how to compute statistical significances of the discovered patterns [11,12,13,14], the reported p-values in these studies are biased in the sense that the selection effect of the mining algorithms are not taken into account (unless a multiple testing correction procedure is applied to these p-values afterward).…”
Section: Related Approaches (mentioning)
confidence: 99%
“…Rather, identifying the statistically significant attribute associations where the pattern of the attribute association deviates from the expected, can potentially infer undiscovered possible relationships between nodes in the graph. The statistical significance of a pattern has been emphasized in various data mining problems [12], [13], [7], [14], [15] and the previous works already explored why a statistically significant pattern is more important rather than a frequent pattern. Thus, in this paper we define a statistically significant attribute association and address the problem of uncovering it in attributed graphs.…”
Section: Introduction (mentioning)
confidence: 99%