Quick Detection of Top-k Personalized PageRank Lists

Avrachenkov, Konstantin; Litvak, Nelly; Nemirovsky, Danil; Смирнова, Елена; Sokol, Marina

doi:10.1007/978-3-642-21286-4_5

Cited by 44 publications

(36 citation statements)

References 19 publications


“…Moreover, we also show that

- these can be obtained highly efficiently, if necessary, leveraging existing approximation algorithms [2,4,14,17,21,23,41] and/or parallel implementations [3,32] for computing the PPR scores,
- the proposed formulations are reuse-promoting in the sense that it is possible to divide the work relative to individual seed nodes and cache the intermediary results obtained during the computation,
- these cached results can then be reused for future queries sharing seed nodes, and
- especially in systems with large query throughputs, it may be possible to cluster queries based on the partial overlaps between the seed sets and, thus, significantly reduce the overall robust PPR computation costs.…”

Section: Our Contributions: Robust Personalized Pagerank (Rpr) (mentioning)

confidence: 99%

“…This is especially advantageous when G is large as we can leverage any of the highly effective approximation algorithms [2,4,14,17,21,23,41] or parallelized implementations [3,32] for computing these PPR scores. Most importantly, the first step of the algorithm (where we solve a linear equation independently for each seed node) can be trivially parallelized by assigning each node to a different computation unit.…”

Section: Converting the Problem Into A Set Of Linear Equations (mentioning)

confidence: 99%
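The per-seed decomposition described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the toy `GRAPH`, the value of `ALPHA`, and the helper names `single_seed_ppr` and `seed_set_ppr` are all assumptions made for the example. It relies only on the fact that PPR is linear in the teleport vector, so the score vector for a uniformly weighted seed set is the average of the single-seed vectors, and those single-seed vectors can be cached and reused across queries.

```python
from functools import lru_cache

# Hypothetical toy graph: node -> list of out-neighbours (no dangling nodes).
GRAPH = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
ALPHA = 0.15  # teleportation (restart) probability; an assumed value


@lru_cache(maxsize=None)
def single_seed_ppr(seed, sweeps=200):
    """Approximately solve x = ALPHA * e_seed + (1 - ALPHA) * P^T x.

    Fixed-point iteration starting from the zero vector; the result is
    cached so that later queries sharing this seed reuse it.
    """
    nodes = sorted(GRAPH)
    x = {v: 0.0 for v in nodes}
    for _ in range(sweeps):
        nxt = {v: (ALPHA if v == seed else 0.0) for v in nodes}
        for u in nodes:
            share = (1 - ALPHA) * x[u] / len(GRAPH[u])
            for v in GRAPH[u]:
                nxt[v] += share
        x = nxt
    return tuple(x[v] for v in nodes)


def seed_set_ppr(seeds):
    """Average the cached per-seed vectors (valid by linearity of PPR)."""
    vectors = [single_seed_ppr(s) for s in seeds]
    return [sum(column) / len(seeds) for column in zip(*vectors)]


if __name__ == "__main__":
    print(seed_set_ppr((0, 1)))
    print(seed_set_ppr((1, 3)))  # reuses the cached vector for seed 1
    print(single_seed_ppr.cache_info())
```

Because each call to `single_seed_ppr` is independent of the others, the per-seed solves could also be assigned to separate workers; the cache here only illustrates the reuse across queries with overlapping seed sets.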

“…Alternatively, PowerIteration [27] or using iterative approximations [14,30], which explicitly simulate the dissemination of probability mass by repeatedly applying the transition process to an initial distribution π₀ until a convergence criterion is satisfied. Recent advances on PPR computation include top-k and approximate personalized PageRank algorithms [2,4,14,17,21,23,41] and parallelized implementations on MapReduce or Pregel based systems [3,32,36,38]. The FastRWR algorithm [41], for example partitions the graph into subgraphs and indexes partial intermediary solutions.…”

Section: Obtaining Pagerank and Personalized Pagerank Scores (mentioning)

confidence: 99%
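The power-iteration scheme mentioned in the statement above (repeatedly applying the transition process to an initial distribution until a convergence criterion is met) might look like the sketch below. The function name `personalized_pagerank`, the default `alpha`, the L1 stopping rule, and the handling of dangling nodes are assumptions chosen for illustration; the cited works may use different conventions. The sketch also assumes every neighbour and every teleport node appears as a key of `graph`.

```python
def personalized_pagerank(graph, teleport, alpha=0.15, tol=1e-10, max_iter=1000):
    """Power iteration for personalized PageRank.

    graph:    dict mapping node -> list of out-neighbours
    teleport: dict mapping node -> probability (sums to 1), the seed distribution
    alpha:    restart probability (assumed default)
    """
    nodes = list(graph)
    # Initial distribution pi_0: start from the teleport vector itself.
    pi = {v: teleport.get(v, 0.0) for v in nodes}
    for _ in range(max_iter):
        nxt = {v: alpha * teleport.get(v, 0.0) for v in nodes}
        for u in nodes:
            out = graph[u]
            if out:
                share = (1 - alpha) * pi[u] / len(out)
                for v in out:
                    nxt[v] += share
            else:
                # Dangling node: send its mass back to the teleport vector
                # (one common convention, assumed here).
                for v, p in teleport.items():
                    nxt[v] += (1 - alpha) * pi[u] * p
        # Convergence criterion: L1 distance between successive iterates.
        delta = sum(abs(nxt[v] - pi[v]) for v in nodes)
        pi = nxt
        if delta < tol:
            break
    return pi


if __name__ == "__main__":
    toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
    print(personalized_pagerank(toy, {"a": 1.0}))
```

Each sweep preserves total probability mass, so the returned scores sum to 1 up to floating-point error.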


Reducing seed noise in personalized PageRank

Huang, Candan, et al. 2016. Soc. Netw. Anal. Min.

Network-based recommendation systems leverage the topology of the underlying graph and the current user context to rank objects in the database. Random-walk-based techniques, such as PageRank, encode the structure of the graph in the form of a transition matrix of a stochastic process from which the significances of the nodes in the graph are inferred. Personalized PageRank (PPR) techniques complement this with a seed node set which serves as the personalization context. In this paper, we note (and experimentally show) that PPR algorithms that do not differentiate among the seed nodes may not properly rank nodes in situations where the seed set is incomplete and/or noisy. To tackle this problem, we propose alternative robust personalized PageRank (RPR) strategies, which are insensitive to noise in the set of seed nodes and in which the rankings are not overly biased towards the seed nodes. In particular, we show that novel teleportation discounting and seed-set maximal PPR techniques help eliminate harmful bias of individual seed nodes and provide effective seed differentiation to lead to more accurate rankings. We also show that the proposed techniques lead to efficient implementations, where existing approximation algorithms and/or parallel implementations for computing the PPR scores can be easily leveraged. Moreover, the proposed formulations are reuse-promoting in the sense that it is possible to divide the work relative to individual seed nodes and cache the intermediary results obtained during the computation. Especially in systems with large query throughputs, it may also be possible to cluster queries based on the partial overlaps between the seed sets and reduce the overall robust PPR computation costs. Experimental results show that the proposed techniques are efficient and highly effective in improving recommendations and eliminating unwanted bias due to imperfections in the seed set.


“…The number n = |V| (m = |E|) of nodes (edges) of each graph are shown in Table 2. Web-stanford-cs and Web-stanford were crawled from stanford.edu. Each node is a web domain and a directed link stands for a hyperlink between two nodes.…”

Section: Datasets (mentioning)

confidence: 99%

“…The top-k RWR proximity query retrieves the k nodes with the highest proximity from a given query node q in a graph. This problem has been investigated previously and efficient solutions have been proposed for it (e.g., [11,3,10]). …”

Section: Introduction (mentioning)

confidence: 99%
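To make the top-k RWR proximity query in the statement above concrete, here is a small sketch that estimates proximities by simulating random walks with restart and then keeps the k largest. It is an illustrative Monte Carlo estimate under stated assumptions, not the algorithm of any cited paper: the function name `top_k_rwr`, the default `alpha`, the number of walks, and the rule for dangling nodes are all hypothetical choices. It uses the fact that a walk from q that stops with probability `alpha` at each step ends at a node with probability equal to that node's RWR proximity from q.

```python
import heapq
import random
from collections import Counter


def top_k_rwr(graph, q, k, alpha=0.15, walks=20000, seed=0):
    """Estimate the k nodes with the highest RWR proximity from query node q.

    graph: dict mapping node -> list of out-neighbours
    Returns a list of (node, estimated_proximity), highest first.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ends = Counter()
    for _ in range(walks):
        node = q
        # Continue with probability 1 - alpha, stop with probability alpha.
        while rng.random() > alpha:
            out = graph[node]
            # Dangling node: jump back to q (assumed convention).
            node = rng.choice(out) if out else q
        ends[node] += 1
    estimates = ((v, count / walks) for v, count in ends.items())
    return heapq.nlargest(k, estimates, key=lambda pair: pair[1])


if __name__ == "__main__":
    toy = {"q": ["a", "b", "c"], "a": ["q"], "b": ["q"], "c": ["q"], "z": ["q"]}
    print(top_k_rwr(toy, "q", 2))
```

Nodes that no walk ever ends at are simply absent from the result, so fewer than k entries can be returned on small or poorly connected graphs.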

Reverse top-k search using random walk with restart

Mamoulis. 2014. Proc. VLDB Endow.

With the increasing popularity of social networks, large volumes of graph data are becoming available. Large graphs are also derived by structure extraction from relational, text, or scientific data (e.g., relational tuple networks, citation graphs, ontology networks, protein-protein interaction graphs). Node-to-node proximity is the key building block for many graph-based applications that search or analyze the data. Among various proximity measures, random walk with restart (RWR) is widely adopted because of its ability to consider the global structure of the whole network. Although RWR-based similarity search has been well studied before, there is no prior work on reverse top-k proximity search in graphs based on RWR. We discuss the applicability of this query and show that its direct evaluation using existing methods on RWR-based similarity search has very high computational and storage demands. To address this issue, we propose an indexing technique, paired with an on-line reverse top-k search algorithm. Our experiments show that our technique is efficient and has manageable storage requirements even when applied on very large graphs.


A Simple Study of Pleasing Parallelism on Multicore Computers

Ren, Gleich. 2020. Parallel Algorithms in Computational Science and Engineering
