A domain-specific crawler, as distinct from a general web search engine, focuses on a specific segment of web content. Search engines built on such crawlers are also called vertical or topical search engines. Common vertical search engines cover shopping, the automotive industry, legal information, medical information, scholarly literature, and travel; examples include Trulia.com, Mocavo.com, and Yelp. In contrast to general-purpose Web search engines, which attempt to index large portions of the World Wide Web using a web crawler, vertical search engines typically use a domain-specific crawler that attempts to index only Web pages relevant to a predefined topic or set of topics. Vertical search offers several potential benefits over general search, such as greater precision due to its limited scope, the ability to leverage domain knowledge including taxonomies and ontologies, and support for specific, unique user tasks. This paper analyzes the machine learning techniques used for Web page classification, namely artificial neural networks (ANN), support vector machines (SVM), and Hi-SVM, and suggests suitable improvements. A crawling framework has been designed and developed that allows flexible addition of new classifiers, and it has been used to classify web content for a few domains. The crawlers themselves are implemented as multithreaded objects that run concurrently. The results show that Hi-SVM is a better choice for guiding a topical crawler than SVM or a neural network: in the comparative analysis of the three classifiers, Hi-SVM performed most efficiently.
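The idea of a crawler guided by a pluggable page classifier can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the framework described in the abstract: the `register` decorator, the `keyword_classifier` stand-in (used here in place of ANN, SVM, or Hi-SVM), the in-memory `PAGES` and `LINKS` toy web, and the threshold value are all hypothetical, and the sketch is single-threaded for brevity, whereas the abstract describes multithreaded crawlers.

```python
import heapq
from typing import Callable, Dict, List

# Hypothetical interface: a classifier maps page text to a relevance score in [0, 1].
Classifier = Callable[[str], float]

# Registry that lets new classifiers be added without touching the crawl loop.
CLASSIFIERS: Dict[str, Classifier] = {}


def register(name: str):
    """Decorator that stores a classifier under the given name."""
    def wrap(fn: Classifier) -> Classifier:
        CLASSIFIERS[name] = fn
        return fn
    return wrap


@register("keyword")
def keyword_classifier(text: str) -> float:
    """Toy stand-in for a trained model: fraction of words that are topic terms."""
    topic_terms = {"svm", "classifier", "crawler"}
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(1 for w in words if w in topic_terms) / len(words)


def focused_crawl(seed: str,
                  pages: Dict[str, str],
                  links: Dict[str, List[str]],
                  classifier: Classifier,
                  threshold: float = 0.2,
                  limit: int = 10) -> List[str]:
    """Best-first crawl: expand only pages the classifier scores as on-topic."""
    frontier = [(-1.0, seed)]          # min-heap on negated priority
    seen = {seed}
    relevant: List[str] = []
    while frontier and len(relevant) < limit:
        _, url = heapq.heappop(frontier)
        score = classifier(pages.get(url, ""))
        if score < threshold:
            continue                   # off-topic: do not follow its links
        relevant.append(url)
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                # Outlinks inherit the parent's score as their priority.
                heapq.heappush(frontier, (-score, nxt))
    return relevant


# Toy in-memory "web" so the sketch runs without network access.
PAGES = {
    "a": "svm classifier crawler",
    "b": "cooking recipes pasta",
    "c": "topical crawler svm notes",
}
LINKS = {"a": ["b", "c"], "b": ["c"], "c": []}

print(focused_crawl("a", PAGES, LINKS, CLASSIFIERS["keyword"]))
```

Swapping in a different model would only require registering another function with the same signature; the crawl loop itself does not change.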
Nowadays websites are available in bulk, and a single search can return many different results. The problem remains of ranking results by their importance to the user, so as to save time and reduce complexity while searching. Personalized search built on unique user identification can solve this problem to a large extent. In this paper we take a unique personalization approach: we identify the user and tailor the search to the user's interests based on his or her previous searches. We present a personalized web search framework, UIBP (User Identification Based Personalization). A comparison of our model with others shows that our search agent will prove more user friendly, as it makes searching fast and easy and provides accurate results. It is therefore an enhancement in the field of web mining.
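One simple way to tailor results to a user's previous searches is to re-rank them by term overlap with the user's query history. The sketch below illustrates that general idea only; it is not the UIBP algorithm, whose details the abstract does not give. The function names `build_profile` and `personalize`, the scoring rule, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from typing import List


def build_profile(past_queries: List[str]) -> Counter:
    """Count how often each term appears in the user's earlier queries."""
    profile: Counter = Counter()
    for query in past_queries:
        profile.update(query.lower().split())
    return profile


def personalize(results: List[str], profile: Counter) -> List[str]:
    """Re-rank result titles by summed profile weight of their terms.

    sorted() is stable, so results with equal scores keep their original order.
    """
    def score(title: str) -> int:
        return sum(profile[word] for word in title.lower().split())
    return sorted(results, key=score, reverse=True)


# Hypothetical history for one identified user.
history = ["python web crawler", "python svm tutorial"]
results = ["Jaguar car review", "Python crawler tutorial", "SVM basics"]

print(personalize(results, build_profile(history)))
```

A real system would likely key the profile to an authenticated user identifier and weight recent queries more heavily; both are left out here to keep the sketch short.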
Today's society depends heavily on web search to meet daily needs, so millions of web pages are accessed every day. To meet user needs, a growing number of websites and web pages are added. The growing size of web data makes it difficult to obtain useful information with a minimum of clicks. As a result, personalization has acquired a major place in Web search. However, personalization can breach privacy in searching, and personalization with privacy is a leading issue in the current web environment. This paper aims at user satisfaction by using a user identification based personalization approach in a web search engine. Besides personalization, the proposed model preserves privacy during personalization. The proposed system will prove to be user friendly, requiring less effort and raising fewer privacy concerns.