Estimating and sampling graphs with multidimensional random walks

Ribeiro, Bruno; Towsley, Don

doi:10.1145/1879141.1879192

Cited by 285 publications

(261 citation statements)

References 35 publications

(61 reference statements)

Supporting

Mentioning: 258

Contrasting

“…In particular, we conclude from (20) that for small values of α the value of the spectral gap can be approximated as follows:…”

Section: Theorem 23 Given That the Original Graph Has Two Principal (mentioning)

confidence: 91%

“…Within crawl-based sampling methods, random walk (RW) sampling is among the most popular methods [5,11,12,18,20,23]. Let G = (V, E) be an undirected, non-bipartite graph with n nodes.…”

Section: Introduction (mentioning)

confidence: 99%

“…In the real-world, however, networks may consist of several disconnected components, e.g. Twitter [22] and Livejournal [20], to cite two known examples. Moreover, the performance of such methods are closely tied to the difference between the largest and the second largest eigenvalues of the associated RW transition probability matrix.…”

Section: Introduction (mentioning)

confidence: 99%

Improving Random Walk Estimation Accuracy with Uniform Restarts

Avrachenkov, Ribeiro, Towsley

2010

Algorithms and Models for the Web-Graph

Self Cite

152

This work proposes and studies the properties of a hybrid sampling scheme that mixes independent uniform node sampling and random walk (RW)-based crawling. We show that our sampling method combines the strengths of both uniform and RW sampling while minimizing their drawbacks. In particular, our method increases the spectral gap of the random walk, and hence, accelerates convergence to the stationary distribution. The proposed method resembles PageRank but unlike PageRank preserves time-reversibility. Applying our hybrid RW to the problem of estimating degree distributions of graphs shows promising results.
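The hybrid scheme described in this abstract can be illustrated with a minimal sketch. The abstract does not spell out the restart rule, so the code assumes one: from a node of degree d, the walk jumps to a uniformly chosen node with probability alpha / (d + alpha) and otherwise moves to a uniformly chosen neighbour. Under that assumption the walk visits a node with probability proportional to d + alpha, and the estimator reweights each visit by 1 / (d + alpha). The names `hybrid_walk`, `estimate_degree_distribution`, and `alpha` are hypothetical and chosen only for this illustration.

```python
import random
from collections import defaultdict


def hybrid_walk(adj, alpha, steps, rng):
    """Random walk with uniform restarts on an undirected graph.

    adj maps each node to a list of its neighbours. The restart rule
    (jump with probability alpha / (degree + alpha)) is an assumption
    made for this sketch, not a rule quoted from the abstract.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    nodes = list(adj)
    current = rng.choice(nodes)
    visited = []
    for _ in range(steps):
        visited.append(current)
        degree = len(adj[current])
        if rng.random() < alpha / (degree + alpha):
            current = rng.choice(nodes)         # uniform restart
        else:
            current = rng.choice(adj[current])  # ordinary RW step
    return visited


def estimate_degree_distribution(adj, visited, alpha):
    """Estimate the fraction of nodes having each degree.

    Each visit is weighted by 1 / (degree + alpha) to undo the bias
    of the walk toward high-degree nodes.
    """
    weight_by_degree = defaultdict(float)
    total = 0.0
    for node in visited:
        degree = len(adj[node])
        weight = 1.0 / (degree + alpha)
        weight_by_degree[degree] += weight
        total += weight
    return {k: w / total for k, w in weight_by_degree.items()}


if __name__ == "__main__":
    # A star with four leaves plus a separate two-node component.
    graph = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0], 5: [6], 6: [5]}
    walk = hybrid_walk(graph, alpha=1.0, steps=20000, rng=random.Random(0))
    print(estimate_degree_distribution(graph, walk, alpha=1.0))
```

The example graph is disconnected on purpose: a plain random walk would stay inside the component where it started, while the uniform restarts let the sketch reach both components.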

“…In this paper we consider six sampling methods (see Table II): 1) random-walk sampling (RW) [8]; 2) random-walk sampling with uniform restarts (RWJ and RWU) [8]; 3) frontier sampling (FS) [9]; 4) expansion sampling (XS) [10]; 5) adjusted expansion sampling (AXS); and 6) randomized expansion sampling (RXS). The first four methods are described in [8], [9], [10].…”

Section: A Sampling Methods (mentioning)

confidence: 99%

“…The first four methods are described in [8], [9], [10]. For convenience, in what follows we briefly describe XS.…”

Section: A Sampling Methods (mentioning)

confidence: 99%

Online estimating the k central nodes of a network

Lim, Menasché, Ribeiro et al.

2011

2011 IEEE Network Science Workshop

Self Cite

ExPregel: a new computational model for large‐scale graph processing

Sagharichian

Naderi

Haghjoo

2015

Concurrency and Computation

These days, large-scale graph processing becomes more and more important. Pregel, inspired by Bulk Synchronous Parallel, is one of the highly used systems to process large-scale graph problems. In Pregel, each vertex executes a function and waits for a superstep to communicate its data to other vertices. Superstep is a very time-consuming operation, used by Pregel, to synchronize distributed computations in a cluster of computers. However, it may become a bottleneck when the number of communications increases in a graph with million vertices. Superstep works like a barrier in Pregel that increases the side effect of skew problem in distributed computing environment. ExPregel is a Pregel-like model that is designed to reduce the number of communication messages between two vertices resided on two different computational nodes. We have proven that ExPregel reduces the number of exchanged messages as well as the number of supersteps for all graph topologies. Enhancing parallelism in our new computational model is another important feature that manifolds the speed of graph analysis programs. More interestingly, ExPregel uses the same model of programming as Pregel. Our experiments on large-scale real-world graphs show that ExPregel can reduce network traffic as well as number of supersteps from 45% to 96%. Runtime speed up in the proposed model varies from 1.2× to 30×.

... shared-memory fashion [8,9]. This approach, to some extent, tries to overcome the problems of the previous approach. However, it has scalability and fault-tolerance problems. Adopting graphics processing units to accelerate various graph-processing tasks forms another approach [10,11]. Sampling approach was used in [12,13] to overcome the problem of scalability in massive data. They divided the input graph into various sub-graphs and then estimated the property of the main graph according to the properties of the smaller sub-graphs. One of the main problems of the sampling approach is the large difference between the real solution and the estimated one.

In contrast to these approaches, distributed-memory approach uses a commodity of computers, and it is a general solution to scalability, performance, and availability problems. It can be particularly used to solve massive graph problems. In [14,15], distributed frameworks were used to shrink processing overheads among computational nodes. In particular, MapReduce has emerged as an enabling technology for big data processing [14,16,17]. While MapReduce simplifies implementation of large-scale data-processing systems, it does not naturally and efficiently support many important graph-processing algorithms and may lead to inefficient solutions.

In 2010, Google proposed a computational model so-called Pregel [1] dedicated to large-scale graph processing. It is inspired by Valiant's Bulk Synchronous Parallel (BSP) model and facilitates implementing distributed graph algorithms. A program in Pregel consists of a sequence of iterations, called superstep. During a superstep, Pregel invokes a user-defined function ...
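The superstep-and-barrier pattern described in this abstract can be sketched as a toy, single-process loop. This is not Pregel's or ExPregel's API: the function name `run_supersteps` and the choice of task (propagating the largest vertex value through each connected component) are assumptions made only for illustration. Messages sent during one superstep become visible only in the next, which is the role the barrier plays.

```python
def run_supersteps(adj, values, max_supersteps=50):
    """Toy bulk-synchronous loop that spreads the maximum value.

    adj maps each vertex to a list of neighbours; values maps each
    vertex to a number. Hypothetical helper, not a real Pregel call.
    """
    values = dict(values)

    # Superstep 0: every vertex sends its value to its neighbours.
    inbox = {v: [] for v in adj}
    for v in adj:
        for u in adj[v]:
            inbox[u].append(values[v])
    supersteps = 1

    # Continue while any vertex has pending messages.
    while any(inbox.values()) and supersteps < max_supersteps:
        outbox = {v: [] for v in adj}
        for v, messages in inbox.items():
            if not messages:
                continue  # no messages: the vertex stays idle
            best = max(messages)
            if best > values[v]:
                values[v] = best
                for u in adj[v]:
                    outbox[u].append(best)
        # Barrier: messages are delivered only after all vertices ran.
        inbox = outbox
        supersteps += 1
    return values, supersteps


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1]}
    final_values, used = run_supersteps(path, {0: 3, 1: 1, 2: 2})
    print(final_values, used)
```

Every message in the sketch waits for the barrier, which hints at why the abstract treats the number of supersteps and exchanged messages as the cost to reduce.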
