A Forecasting Model for the Pages Crawled by Search Engine Crawlers at a Web Site

Jose, Jeeva; Lal, P. Sojan

doi:10.5120/11639-7122

Cited by 1 publication

(2 citation statements)

References 9 publications

“…There are several works that mentions about the search engine crawler behavior. A forecasting model is proposed for the number of pages crawled by search engine crawlers at a web site [3]. Sun et al has conducted a large scale study of robots.txt [2].…”

Section: Background Literature (mentioning)

confidence: 99%

“…There is open source software available like Google Analytics which measures the number of visitors, duration of the visits, the demographic from which the visitor comes etc. But it cannot identify search engine visits because Google Analytics track users with the help of JavaScript and search engine crawlers do not enable the JavaScript embedded in web pages when the crawlers visit the web sites [3]. The search engine crawlers initially access the robots.txt file which specifies the Robot Exclusion Protocol.…”

Section: Introduction (mentioning)

confidence: 99%

Application of ARIMA(1,1,0) Model for Predicting Time Delay of Search Engine Crawlers

Jose; Lal

2013

Self Cite

The World Wide Web is growing at a tremendous rate in terms of the number of visitors and the number of web pages. Search engine crawlers are highly automated programs that periodically visit the web and index web pages. The behavior of search engines can be used in analyzing server load, the quality of search engines, the dynamics of search engine crawlers, the ethics of search engines, etc. The more often a crawler visits a web site, the more it contributes to the workload. The time delay between two consecutive visits of a crawler determines the dynamicity of the crawler. The ARIMA(1,1,0) model in time series analysis works well for forecasting the time delay between the visits of search engine crawlers at web sites. We considered 5 search engine crawlers, all of which could be modeled using ARIMA(1,1,0). The results of this study are useful in analyzing server load.
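
For readers unfamiliar with the notation, an ARIMA(1,1,0) model without a constant says that the first difference of the series follows an AR(1) process: d_t = phi * d_(t-1) + e_t, where d_t = y_t - y_(t-1). The following is a minimal sketch of that idea, not the authors' procedure. It assumes conditional least squares as the estimator, and the function names and the delay values are hypothetical, chosen only for illustration.

```python
def fit_arima_110(series):
    """Estimate phi for ARIMA(1,1,0) with no constant term.

    Assumption: conditional least squares on the first differences,
    i.e. regress d_t on d_(t-1) through the origin.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    num = sum(d_prev * d_cur for d_prev, d_cur in zip(diffs, diffs[1:]))
    den = sum(d_prev * d_prev for d_prev in diffs[:-1])
    if den == 0:
        raise ValueError("lagged differences are all zero; phi is undefined")
    return num / den


def forecast_arima_110(series, phi, steps=1):
    """Forecast future values: y_(t+1) = y_t + phi * (y_t - y_(t-1))."""
    history = list(series)
    forecasts = []
    for _ in range(steps):
        nxt = history[-1] + phi * (history[-1] - history[-2])
        history.append(nxt)
        forecasts.append(nxt)
    return forecasts


# Hypothetical delays (in hours) between consecutive visits of one crawler.
delays = [10.0, 12.0, 13.0, 13.5, 13.75]
phi = fit_arima_110(delays)
next_delays = forecast_arima_110(delays, phi, steps=2)
print(phi, next_delays)
```

In practice a statistics package would also estimate the noise variance and provide diagnostics; this sketch only shows how differencing and the single autoregressive term combine into a forecast.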
