2015
DOI: 10.1002/cpe.3701

A task scheduling strategy based on weighted round robin for distributed crawler

Abstract: With the rapid development of the network, stand-alone crawlers find it increasingly difficult to locate and gather information, and distributed crawlers are gradually being adopted to solve this problem. This paper proposes a task scheduling strategy based on weighted round robin for a small-scale distributed crawler, in which the weight of the current node is computed by a formula based on its crawling efficiency. It implements a distributed crawler system with multithreading support and deduplication that takes this algorithm as its core, and discusses some po…
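The paper's exact weighting formula is not reproduced in this abstract. As a rough illustration of the general idea only, the sketch below shows a generic weighted round-robin dispatcher whose per-node weights are periodically refreshed from an assumed crawling-efficiency measurement (pages fetched per second). The node names, the efficiency figures, and the refresh rule are all hypothetical and are not taken from the paper.

```python
import itertools
from typing import Dict, List


class WeightedRoundRobinScheduler:
    """Minimal weighted round-robin dispatcher for crawler nodes.

    This is the simple "expanded slot" form of WRR: each node appears in the
    dispatch cycle a number of times equal to its weight. The weights here are
    illustrative stand-ins for an efficiency-based formula: a node that crawls
    faster is given proportionally more URLs.
    """

    def __init__(self, weights: Dict[str, int]):
        self.weights = dict(weights)
        self._cycle = self._build_cycle()

    def _build_cycle(self):
        # Expand each node into `weight` slots, then cycle through the slots.
        slots: List[str] = []
        for node, weight in self.weights.items():
            slots.extend([node] * max(1, weight))
        return itertools.cycle(slots)

    def update_weights(self, efficiency: Dict[str, float]) -> None:
        # Hypothetical refresh rule (not the paper's formula): scale each
        # node's weight by its observed pages/second, normalised so the
        # slowest node still receives weight 1.
        slowest = min(efficiency.values())
        self.weights = {node: max(1, round(rate / slowest))
                        for node, rate in efficiency.items()}
        self._cycle = self._build_cycle()

    def next_node(self) -> str:
        # Return the node that should receive the next crawl task.
        return next(self._cycle)


if __name__ == "__main__":
    # Assumed example: node-a has been measured as roughly three times
    # faster than node-b, so it gets three slots per cycle.
    scheduler = WeightedRoundRobinScheduler({"node-a": 3, "node-b": 1})
    for url_id in range(8):
        print(url_id, scheduler.next_node())
```

Under these assumptions, dispatching eight URLs sends about three quarters of them to node-a, which is the load-balancing behaviour a weighted round-robin strategy is meant to provide.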

Cited by 4 publications (3 citation statements)
References 14 publications
“…Randomized rounding and linear and dynamic programming algorithms are used to solve the multi-objective problem [4]. The most popular rounding algorithm is the Round Robin Algorithm [7], and its modified versions: Weighted Round-Robin [9], Fair Round-Robin [22] and Adaptive Round Robin [17].…”
Section: Introduction (mentioning)
confidence: 99%
“…Due to the explosive growth of the Internet information content, web search engines are becoming increasingly important as the main means of locating relevant information. A complete search engine is comprised of different parts like web crawler system, page index system, page sort system and user interface [1]. Till now, a lot of scholars are carrying out the relevant research works about web search engine.…”
Section: Introduction (mentioning)
confidence: 99%
“…The continuing expansion of the Internet has presented ever greater challenges to crawler services used to collect information for indexing. ‘A Task Scheduling Strategy based on Weighted Round‐Robin for Distributed Crawler’ presents an implementation of a multithread distributed crawler which is scalable and fault tolerant. The paper includes experiments which indicate that the system exhibits good load balancing performance.…”
Section: Introduction (mentioning)
confidence: 99%