2019
DOI: 10.1007/s41109-019-0201-9

Estimating PageRank deviations in crawled graphs

Abstract: Most real-world graphs collected from the Web, such as Web graphs and social network graphs, are partially discovered or crawled. This leads to inaccurate estimates of graph properties based on link analysis, such as PageRank. In this paper we focus on studying such deviations in the ordering/ranking imposed by PageRank over crawled graphs. We first show that deviations in rankings induced by PageRank are indeed possible. We measure how much a ranking, induced by PageRank, on an input graph could deviate from the origin…
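As a rough illustration of the kind of deviation the abstract describes, the sketch below computes PageRank on a small toy graph and on a partially crawled version of it, then counts the node pairs whose relative order differs. Everything here is an assumption made for illustration: the toy edge list, the function names `pagerank` and `discordant_pairs`, the damping factor 0.85, and the use of a discordant-pair count as the deviation measure. It is not the measure or method used in the paper.

```python
from itertools import combinations


def pagerank(edges, nodes, damping=0.85, iters=100):
    """Power-iteration PageRank restricted to `nodes`.

    Edges with an endpoint outside `nodes` are ignored, which mimics a crawl
    that never discovered those pages. Dangling nodes spread rank uniformly.
    """
    nodes = sorted(nodes)
    n = len(nodes)
    out = {u: [] for u in nodes}
    for u, v in edges:
        if u in out and v in out:
            out[u].append(v)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        dangling = sum(rank[u] for u in nodes if not out[u])
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            if out[u]:
                share = damping * rank[u] / len(out[u])
                for v in out[u]:
                    new[v] += share
        rank = new
    return rank


def discordant_pairs(rank_a, rank_b, nodes):
    """Count node pairs ordered differently by the two score maps."""
    count = 0
    for u, v in combinations(sorted(nodes), 2):
        if (rank_a[u] - rank_a[v]) * (rank_b[u] - rank_b[v]) < 0:
            count += 1
    return count


# Hypothetical toy graph; node "e" is never reached by the crawl.
full_edges = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("d", "c"), ("d", "a"), ("e", "b"), ("e", "d"),
]
full_nodes = {"a", "b", "c", "d", "e"}
crawled_nodes = {"a", "b", "c", "d"}

full_rank = pagerank(full_edges, full_nodes)
crawl_rank = pagerank(full_edges, crawled_nodes)

# Compare the two rankings on the nodes that both graphs share.
deviation = discordant_pairs(full_rank, crawl_rank, crawled_nodes)
print("discordant pairs among crawled nodes:", deviation)
```

A count of zero means the crawl preserved the relative order of every shared pair of nodes; larger counts mean the crawled ranking has drifted further from the ranking on the full graph.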

Citations: cited by 5 publications (4 citation statements)
References: 34 publications
“…Although scientists have looked into graph properties of the web in general, both in static (Albert et al, 1999; Broder et al, 2000; Adamic et al, 2000; Suel and Yuan, 2001; Boldi and Vigna, 2004) and evolving graphs (Huberman and Adamic, 1999; Leskovec et al, 2005, 2007), we found that certain traits of web archives lead to new kinds of questions. For instance, as we show in Holzmann et al (2018, 2019), the inherent incompleteness of archives can affect rankings produced by graph algorithms on web archive graphs.…”
Section: Open Challenges (mentioning)
confidence: 83%
“…Our approach, therefore, allows us to attribute any decision to a small subset of the node neighborhood, hence increasing interpretability. We believe that our work can be extended to ML tasks in multiple domains like Web tasks [12,13], rankings [31,30] and tabular data [8].…”
Section: Discussion (mentioning)
confidence: 99%
“…In [14], the authors focused on the problem of the deviations in PageRank values caused by restricted crawling. A further variation of traditional PageRank is proposed in [24], in which the original transition matrix is replaced with one whose entries are based on the number of a node's N-step neighbours.…”
Section: Background and Related Work 2.1 PageRank - A Measure Of Import... (mentioning)
confidence: 99%