The analysis of contextual information in search engine query logs is an important yet difficult task. Users submit few queries and often search multiple topics, sometimes with closely related context. Identification of topic changes within a search session is an important branch of contextual information analysis. The purpose of this study is to propose a topic identification algorithm using neural networks. A sample from the Excite data log is selected to train the neural network, and the trained network is then used to identify topic changes in the data log. As a result, 76% of topic shifts and 92% of topic continuations are identified correctly.
Purpose - This study aims to propose an artificial neural network that automatically identifies topic changes in a user session by using the statistical characteristics of queries, such as time intervals and query reformulation patterns. Design/methodology/approach - A sample data log from the Norwegian search engine FAST (currently owned by Overture) is selected to train the neural network, and the trained network is then used to identify topic changes in the data log. Findings - A total of 98.4 percent of topic shifts and 86.6 percent of topic continuations were estimated correctly. Originality/value - Content analysis of search engine user queries is an important task, since successful exploitation of the content of queries can result in the design of efficient information retrieval algorithms for search engines, which can offer custom-tailored services to the web user. Identification of topic changes within a user search session is a key issue in the content analysis of search engine user queries.
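To make the two statistical characteristics named above concrete, the following is a minimal sketch of how a pair of consecutive queries might be turned into a time interval and a query reformulation pattern before being fed to a classifier. The function names and the pattern labels ("repeat", "new", "specialization", "generalization", "reformulation") are illustrative assumptions based on simple term overlap; they are not taken from the study, and the neural network itself is not shown.

```python
def reformulation_pattern(prev_query, next_query):
    """Label how next_query relates to prev_query by term overlap.

    The labels are hypothetical; the study's own categories may differ.
    """
    prev_terms = set(prev_query.lower().split())
    next_terms = set(next_query.lower().split())
    if prev_terms == next_terms:
        return "repeat"            # same set of terms
    if not prev_terms & next_terms:
        return "new"               # no terms in common
    if prev_terms < next_terms:
        return "specialization"    # terms were only added
    if next_terms < prev_terms:
        return "generalization"    # terms were only removed
    return "reformulation"         # some terms kept, some replaced


def query_pair_features(prev_query, prev_time, next_query, next_time):
    """Build the content-ignorant features for one pair of consecutive queries.

    Times are assumed to be seconds since the start of the session.
    """
    return {
        "interval_s": next_time - prev_time,
        "pattern": reformulation_pattern(prev_query, next_query),
    }


example = query_pair_features("cheap flights", 0, "cheap flights oslo", 45)
```

Only the overlap of query terms is inspected, never their meaning, which matches the idea of identifying topic changes from statistical characteristics rather than from content.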
Purpose - Content analysis of search engine user queries is an important task, since successful exploitation of the content of queries can result in the design of efficient information retrieval algorithms for search engines, which can offer custom-tailored services to the web user. Identification of topic changes within a user search session is a key issue in content analysis of search engine user queries. The purpose of this study is to address these issues. Design/methodology/approach - This study applies genetic algorithms and Dempster-Shafer theory, as proposed by He et al., to automatically identify topic changes in a user session by using statistical characteristics of queries, such as time intervals and query reformulation patterns. A sample data log from the Norwegian search engine FAST (currently owned by Overture) is selected, and Dempster-Shafer theory and genetic algorithms are applied to identify topic changes in the data log. Findings - As a result, 97.7 percent of topic shifts and 87.2 percent of topic continuations were estimated correctly. The findings are consistent with the previous application of Dempster-Shafer theory and genetic algorithms to a different search engine data log. This consistency can be taken as an indication that content-ignorant topic identification, using query patterns and time intervals, is a promising line of research. Originality/value - Studies an important dimension of user behavior in information retrieval.
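As an illustration of how Dempster-Shafer theory can merge evidence from two sources, such as the time interval and the query reformulation pattern, the sketch below implements Dempster's rule of combination for two mass functions. The mass values are made-up numbers chosen only for the example, the hypothesis names are assumptions, and the genetic algorithm step mentioned in the abstract is not shown.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of one function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is set aside.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Renormalize by the non-conflicting mass.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


SHIFT = frozenset({"shift"})
CONTINUATION = frozenset({"continuation"})
EITHER = SHIFT | CONTINUATION  # mass left uncommitted

# Hypothetical evidence for one pair of consecutive queries.
mass_from_interval = {SHIFT: 0.6, CONTINUATION: 0.1, EITHER: 0.3}
mass_from_pattern = {SHIFT: 0.7, CONTINUATION: 0.2, EITHER: 0.1}

belief = combine(mass_from_interval, mass_from_pattern)
```

With these example masses, both sources lean toward a topic shift, so the combined mass on a shift is higher than either source assigns on its own, while the uncommitted mass shrinks.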