2013
DOI: 10.5120/10440-5125

Web Crawler: A Review

Abstract: Information retrieval deals with searching for and retrieving information within documents, as well as searching online databases and the internet. A web crawler is a program that traverses the Web and downloads web documents in a methodical, automated manner. Based on the type of knowledge used, web crawlers are usually divided into three types of crawling techniques: General Purpose Crawling,

Cited by 86 publications (40 citation statements)
References 22 publications
“…Ref represents a survey of domain‐specific query forms that is related only to hidden Web. Ref also represents a review of crawler but it is short. To carry out this research, we framed four research questions to determine the current status of Web crawlers.…”
Section: Discussion (mentioning)
confidence: 99%
“…A web crawler (also known as a spider) is "a software or a programmed script that browses the WWW in a systematic, automated manner" [58], and systematically downloads numerous webpages starting from a seed URL [9]. Web crawlers date back to the 1990s, where they were introduced when the WWW was invented.…”
Section: Web Crawlers and Crawling Techniques (mentioning)
confidence: 99%
“…Later on, web crawlers that could efficiently download millions of webpages were built. We can consider the Internet to be a "directed graph" where each node represents a webpage, and the edges represent hyperlinks connecting these webpages [58]. Web crawlers traverse over this graph-like structure of the Internet, go to webpages, and download their content for indexing purposes.…”
Section: Web Crawlers and Crawling Techniques (mentioning)
confidence: 99%
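
The directed-graph view in the statement above can be illustrated with a minimal sketch. It assumes a breadth-first visiting order, which is only one of several possible crawl strategies, and it replaces real HTTP downloads with a lookup in a small in-memory dictionary. The names `crawl`, `TOY_WEB`, `max_pages`, and the `*.example` pages are hypothetical and do not come from the reviewed paper or the citing papers.

```python
from collections import deque


def crawl(link_graph, seed, max_pages=100):
    """Visit pages breadth-first from `seed` and return them in visit order.

    `link_graph` maps each page (node) to the pages it links to (outgoing
    edges). A real crawler would fetch and parse each page here; this
    sketch only looks the links up in the dictionary.
    """
    seen = {seed}             # pages already queued, so none is visited twice
    order = []                # pages in the order they were "downloaded"
    frontier = deque([seed])  # FIFO queue of pages still to visit

    while frontier and len(order) < max_pages:
        page = frontier.popleft()
        order.append(page)    # stands in for the download/index step
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return order


# Hypothetical toy Web: d.example/ links in, but nothing links to it.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/", "a.example/about"],
    "c.example/": [],
    "d.example/": ["a.example/"],
}

if __name__ == "__main__":
    print(crawl(TOY_WEB, "a.example/"))
```

Because edges are directed, a page such as `d.example/` that links to the seed but is not linked from any reachable page is never visited, which is why the choice of seed URLs matters.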
“…RESPONSE TIME: This is the time taken for each query search to be completed by the engines; it can be measured using stop clock or as is displayed by some search engines. Kausar, Dhaka and Singh (2013). RELATIVE RECALL: This is the ability of a system to retrieve all or most of the relevant documents in the collection. The relative recall can be calculated using the following formulae;…”
Section: Precision (mentioning)
confidence: 99%