Proceedings of the 25th International Conference on World Wide Web 2016
DOI: 10.1145/2872427.2883045

On Sampling Nodes in a Network

Abstract: Random walk is an important tool in many graph mining applications including estimating graph parameters, sampling portions of the graph, and extracting dense communities. In this paper we consider the problem of sampling nodes from a large graph according to a prescribed distribution by using random walk as the basic primitive. Our goal is to obtain algorithms that make a small number of queries to the graph but output a node that is sampled according to the prescribed distribution. Focusing on the uniform di…
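
The abstract is truncated, so the paper's own algorithms are not reproduced here. As a generic illustration of sampling a node from a prescribed distribution with a random walk while counting graph queries, the sketch below uses the textbook Metropolis-Hastings walk with a uniform target. It is an assumed stand-in, not necessarily the authors' method. The names `neighbors` (a callable returning a node's neighbor list) and `metropolis_hastings_walk` are hypothetical, and the graph is assumed to be connected and undirected.

```python
import random

def metropolis_hastings_walk(neighbors, start, steps, rng=None):
    """Walk `steps` steps and return (final node, number of neighbor queries).

    Assumption: `neighbors(v)` returns the non-empty neighbor list of v in a
    connected undirected graph. The acceptance rule makes the uniform
    distribution over nodes stationary for this chain.
    """
    rng = rng or random.Random()
    current = start
    current_nbrs = neighbors(current)
    queries = 1  # one query to fetch the start node's neighbors
    for _ in range(steps):
        candidate = rng.choice(current_nbrs)
        candidate_nbrs = neighbors(candidate)
        queries += 1
        # Accept with probability min(1, deg(current) / deg(candidate));
        # otherwise stay put, which offsets the bias toward high-degree nodes.
        if rng.random() < len(current_nbrs) / len(candidate_nbrs):
            current, current_nbrs = candidate, candidate_nbrs
    return current, queries
```

Each step costs one neighbor query in this sketch, so the walk length directly bounds the query count; how many steps are enough depends on how quickly the walk mixes on the given graph.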

Cited by 54 publications (45 citation statements)
References 28 publications
“…We consider 13 real-world undirected network datasets from SNAP and Network Repository, which are listed in an ascending order of graph size in Table 1. We classify four graphs whose sizes are smaller than 50K as small graphs and the rest of them as large graphs. In this experimental evaluation, we focus on computing the Kemeny's constant K of a simple random walk on the largest strongly connected components (LSCC) of a graph, as used in [28].…”
Section: Experiments Results (mentioning)
confidence: 99%
“…Different from the bootstrap sampling methods, we focus on sampling subgraphs from large networks. To our knowledge, although a variety of graph sampling techniques have been introduced in [4], [11], our approach is the first work that combines link prediction characteristics [22] with graph sampling methods to achieve the high link prediction accuracy.…”
Section: Related Work (mentioning)
confidence: 99%
“…When the complete structure of a network is not available, a network sampling is often used to obtain a partial structure of the network [22], [23], [26]-[30], [33]. If arbitrary access to any nodes is allowed, random sampling seems to be a natural choice for simplicity because of its simplicity and neutrality.…”
Section: Introduction (mentioning)
confidence: 99%
“…If arbitrary access to any nodes is allowed, random sampling seems to be a natural choice for simplicity because of its simplicity and neutrality. However, in many real applications, random access to nodes is not allowed [30], [33], so crawl-based sampling techniques have been widely used for analyzing the structure of several types of large-scale networks, such as online social networks [23], the world wide web [34], and peer-to-peer (P2P) networks [35]. When using crawl-based network sampling, it is assumed that only one node can be visited initially, but the neighbors of already visited nodes can be visited at each step.…”
Section: Introduction (mentioning)
confidence: 99%
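
The access model described in the last statement (start from a single node, then visit only neighbors of nodes already visited) can be sketched as a breadth-first crawl. This is one simple crawl-based sampler chosen for illustration, not a method attributed to any of the cited papers. The names `crawl_sample`, `neighbors`, `seed`, and `budget` are hypothetical.

```python
from collections import deque

def crawl_sample(neighbors, seed, budget):
    """Return up to `budget` nodes reached by a breadth-first crawl from `seed`.

    Assumption: `neighbors(v)` is the only way to access the graph, so a node
    can be visited only after one of its neighbors has been visited.
    """
    visited = [seed]          # nodes in the order they were discovered
    seen = {seed}
    frontier = deque([seed])  # visited nodes whose neighbors are not yet expanded
    while frontier and len(visited) < budget:
        node = frontier.popleft()
        for nbr in neighbors(node):
            if nbr not in seen:
                seen.add(nbr)
                visited.append(nbr)
                frontier.append(nbr)
                if len(visited) >= budget:
                    break
    return visited
```

A breadth-first crawl tends to over-represent the neighborhood of the seed; replacing the queue with a random-walk step is a common alternative when a less localized sample is wanted.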