2015
DOI: 10.1002/cpe.3701

A task scheduling strategy based on weighted round robin for distributed crawler

Abstract: With the rapid development of the network, stand-alone crawlers find it increasingly difficult to locate and gather information, and distributed crawlers are gradually being adopted to solve this problem. This paper proposes a task scheduling strategy based on weighted round robin for a small-scale distributed crawler, in which the weight of the current node is computed by a formula based on its crawling efficiency. It implements a distributed crawler system with multithreading support and deduplication that takes this algorithm as its core, and discusses some po…
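The paper's exact weighting formula is not reproduced in this abstract. As a rough illustration of the general idea only, the sketch below shows a generic weighted round-robin dispatcher whose per-node weights are periodically refreshed from an assumed crawling-efficiency measurement (pages fetched per second). The node names, the efficiency figures, and the refresh rule are all hypothetical and are not taken from the paper.

```python
import itertools
from typing import Dict, List


class WeightedRoundRobinScheduler:
    """Minimal weighted round-robin dispatcher for crawler nodes.

    This is the simple "expanded slot" form of WRR: each node appears in the
    dispatch cycle a number of times equal to its weight. The weights here are
    illustrative stand-ins for an efficiency-based formula: a node that crawls
    faster is given proportionally more URLs.
    """

    def __init__(self, weights: Dict[str, int]):
        self.weights = dict(weights)
        self._cycle = self._build_cycle()

    def _build_cycle(self):
        # Expand each node into `weight` slots, then cycle through the slots.
        slots: List[str] = []
        for node, weight in self.weights.items():
            slots.extend([node] * max(1, weight))
        return itertools.cycle(slots)

    def update_weights(self, efficiency: Dict[str, float]) -> None:
        # Hypothetical refresh rule (not the paper's formula): scale each
        # node's weight by its observed pages/second, normalised so the
        # slowest node still receives weight 1.
        slowest = min(efficiency.values())
        self.weights = {node: max(1, round(rate / slowest))
                        for node, rate in efficiency.items()}
        self._cycle = self._build_cycle()

    def next_node(self) -> str:
        # Return the node that should receive the next crawl task.
        return next(self._cycle)


if __name__ == "__main__":
    # Assumed example: node-a has been measured as roughly three times
    # faster than node-b, so it gets three slots per cycle.
    scheduler = WeightedRoundRobinScheduler({"node-a": 3, "node-b": 1})
    for url_id in range(8):
        print(url_id, scheduler.next_node())
```

Under these assumptions, dispatching eight URLs sends about three quarters of them to node-a, which is the load-balancing behaviour a weighted round-robin strategy is meant to provide.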

Cited by 4 publications (3 citation statements)
References 14 publications
“…Randomized rounding and linear and dynamic programming algorithms are used to solve the multi-objective problem [4]. The most popular rounding algorithm is the Round Robin Algorithm [7], and its modified versions: Weighted Round-Robin [9], Fair Round-Robin [22] and Adaptive Round Robin [17].…”
Section: Introduction (mentioning)
confidence: 99%
“…Due to the explosive growth of the Internet information content, web search engines are becoming increasingly important as the main means of locating relevant information. A complete search engine is comprised of different parts like web crawler system, page index system, page sort system and user interface [1]. Till now, a lot of scholars are carrying out the relevant research works about web search engine.…”
Section: Introduction (mentioning)
confidence: 99%
“…The continuing expansion of the Internet has presented ever greater challenges to crawler services used to collect information for indexing. ‘A Task Scheduling Strategy based on Weighted Round‐Robin for Distributed Crawler’ presents an implementation of a multithread distributed crawler which is scalable and fault tolerant. The paper includes experiments which indicate that the system exhibits good load balancing performance.…”
Section: Introduction (mentioning)
confidence: 99%