Published: 2013
DOI: 10.5121/ijwsc.2013.4102
Speeding Up the Web Crawling Process on a Multi-Core Processor Using Virtualization

Cited by 2 publications (3 citation statements)
References 20 publications
“…The Web search engine, the basic tool for information acquisition, is becoming increasingly important because of the explosion in the size of the Web and users' growing demand for finding information [1][2][4]. Web search engines are information retrieval software systems that help find information stored on the Internet by taking query words as input and retrieving information based on matching criteria. Some search engines mine data available in databases or open directories.…”
Section: Introduction
Mentioning confidence: 99%
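To make "retrieving information based on matching criteria" concrete, here is a minimal sketch of keyword matching against an inverted index. It is not taken from the cited paper; the sample documents and the overlap-count scoring are illustrative assumptions.

```python
# Minimal sketch: rank documents by how many query words they contain.
# The documents and scoring scheme are illustrative assumptions.
from collections import defaultdict

documents = {
    1: "web crawlers download pages for search engines",
    2: "search engines retrieve information from an index",
}

# Build the inverted index: term -> set of document ids containing it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query):
    """Score each document by query-word overlap, highest first."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(search("web search"))  # -> [1, 2]: doc 1 matches both words
```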
“…First, it should have a good crawling strategy, i.e., a strategy for deciding which pages to download next. Second, it needs a highly optimized system architecture that is robust against crashes, manageable, and considerate of resources and web servers [1]. The performance of the crawling process has been improved by using a parallel web crawler instead of a batch crawler. However, the existing parallel crawler relies on a single central coordinator, with a high chance of data redundancy: the same URLs are crawled multiple times, which degrades performance [3].…”
Section: Introduction
Mentioning confidence: 99%
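The redundancy problem described above can be illustrated with a minimal sketch of parallel workers sharing one deduplicated frontier, so a URL claimed by one worker is never crawled again by another. This is an assumption-laden toy, not the architecture of the cited papers; the seed URLs and the fetch step are placeholders.

```python
# Minimal sketch: a shared, lock-guarded visited set prevents parallel
# workers from crawling the same URL twice. Seeds and fetch logic are
# placeholder assumptions, not the cited papers' design.
import queue
import threading

frontier = queue.Queue()          # URLs waiting to be crawled
visited = set()                   # URLs already claimed by some worker
visited_lock = threading.Lock()   # guards the shared visited set

def enqueue(url):
    """Add a URL to the frontier only if no worker has claimed it yet."""
    with visited_lock:
        if url not in visited:
            visited.add(url)
            frontier.put(url)

def worker():
    while True:
        try:
            url = frontier.get(timeout=1)
        except queue.Empty:
            return  # frontier drained; worker exits
        # Placeholder for a real HTTP fetch plus link extraction,
        # which would call enqueue() on each discovered link.
        print(f"crawling {url}")
        frontier.task_done()

for seed in ["http://example.com/a", "http://example.com/a",
             "http://example.com/b"]:
    enqueue(seed)  # the duplicate seed is filtered out here

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```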