2011 Proceedings IEEE INFOCOM 2011
DOI: 10.1109/infcom.2011.5935303

Agnostic topology-based spam avoidance in large-scale web crawls

Abstract: With the proliferation of web spam and questionable content with virtually infinite auto-generated structure, large-scale web crawlers now require low-complexity ranking methods to effectively budget their limited resources and allocate the majority of bandwidth to reputable sites. To shed light on Internet-wide spam avoidance, we study the domain-level graph from a 6.3B-page web crawl and compare several agnostic topology-based ranking algorithms on this dataset. We first propose a new methodology for…
