2012 IEEE 12th International Conference on Data Mining 2012
DOI: 10.1109/icdm.2012.87

GUISE: Uniform Sampling of Graphlets for Large Graph Analysis

Abstract: Graphlet frequency distribution (GFD) has recently become popular for characterizing large networks. However, the computation of GFD for a network requires the exact count of embedded graphlets in that network, which is a computationally expensive task. As a result, it is practically infeasible to compute the GFD for even a moderately large network. In this paper, we propose GUISE, which uses a Markov Chain Monte Carlo (MCMC) sampling method for constructing the approximate GFD of a large network. Our experime…
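
The idea of sampling graphlets uniformly with MCMC, rather than counting them exactly, can be illustrated with a small sketch. This is not the GUISE algorithm as published: it is restricted to 3-node graphlets, and the neighbor rule (swap one vertex), the Metropolis-Hastings acceptance test, the toy graph `EDGES`, and all function names are assumptions made for illustration only.

```python
import random
from itertools import combinations

# Hypothetical toy graph, used only for illustration.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5), (1, 4)]

def build_adj(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def is_connected(adj, nodes):
    """Return True if the subgraph induced by `nodes` is connected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v] & nodes:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == nodes

def neighbors(adj, state):
    """Connected 3-node sets reachable by swapping one vertex of `state`."""
    result = set()
    for v in state:
        rest = state - {v}
        candidates = set().union(*(adj[w] for w in rest)) - state
        for u in candidates:
            cand = frozenset(rest | {u})
            if is_connected(adj, cand):
                result.add(cand)
    return sorted(result, key=sorted)  # fixed order for reproducibility

def edge_count(adj, state):
    """Number of edges in the subgraph induced by `state`."""
    return sum(1 for a, b in combinations(sorted(state), 2) if b in adj[a])

def sample_triangle_fraction(adj, start, steps, seed=0):
    """Estimate the share of triangles among connected 3-node subgraphs."""
    rng = random.Random(seed)
    state = frozenset(start)
    nbrs = neighbors(adj, state)
    triangles = 0
    for _ in range(steps):
        proposal = rng.choice(nbrs)
        prop_nbrs = neighbors(adj, proposal)
        # Metropolis-Hastings acceptance for a uniform target distribution:
        # accept with probability min(1, deg(current) / deg(proposal)).
        if rng.random() < min(1.0, len(nbrs) / len(prop_nbrs)):
            state, nbrs = proposal, prop_nbrs
        triangles += edge_count(adj, state) == 3
    return triangles / steps

def exact_triangle_fraction(adj):
    """Brute-force reference value, feasible only for tiny graphs."""
    states = [frozenset(c) for c in combinations(sorted(adj), 3)
              if is_connected(adj, c)]
    return sum(edge_count(adj, s) == 3 for s in states) / len(states)
```

Calling `sample_triangle_fraction(build_adj(EDGES), (0, 1, 2), 20000)` and comparing the result with `exact_triangle_fraction(build_adj(EDGES))` shows how the sampled estimate approaches the exact value on a graph small enough to enumerate. The acceptance ratio corrects for the fact that a plain random walk would visit subgraphs with many neighbors more often than those with few.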

Cited by 94 publications (117 citation statements)
References 17 publications
“…These algorithms provide exact results, and here we will also concentrate on exact frequency computation, but we should note that there exist some sampling alternatives for providing approximate results. Some examples are Rand-ESU [11], Randomized g-tries [15] and GUISE [16].…”
Section: B Related Work (mentioning)
confidence: 99%
“…Known techniques for subgraph counting with large queries (e.g. [7,26]) employ similar graph traversal techniques, making PS consistent with the state of the art for subgraph counting as well as color coding.…”
Section: Procedures 2: Computing Projection Table For (mentioning)
confidence: 94%
“…Our query benchmark consists of the ten real world queries shown in Figure 8. The queries were derived from prior network analysis work spanning diverse domains: dros, ecoli1, ecoli2, brain1, brain2, brain3 - biological networks [22,19]; glet1, glet2 - graphlets [7]; wiki - collaboration networks [32]; youtube - spam networks [24].…”
Section: Methods (mentioning)
confidence: 99%
“…Recently, there is an increased interest in sampling and other heuristic approaches for obtaining approximate counts of various graphlets (Bhuiyan, Rahman, Rahman and Al Hasan, 2012; Gonen and Shavitt, 2009). However, our approach focuses on exact graphlet counting and thus sampling methods are outside the scope of this paper.…”
Section: Related Work (mentioning)
confidence: 99%