This study proposes using genetic algorithms to define topic boundaries in search engine transaction logs. Users are interested in multiple topics during a search session, and genetic algorithms are used in this study to determine whether a search engine user has changed topics during a session. Sample data logs from the FAST and Excite search engines were analyzed. The findings show that genetic algorithms are fairly successful in identifying topic continuations and shifts in search engine transaction logs.

It is a challenge to develop effective information retrieval algorithms and to interpret user information-seeking behavior, since people have different and changing information needs and use different information-searching strategies to solve their information problems (Gremett 2006). One of the challenges in information-seeking behavior is new topic identification, also called session identification: discovering when the user has switched from one topic to another during a single search session (He, Goker, and Harper 2002). To find useful patterns in user sessions, the queries in the transaction logs must first be grouped into clusters. Once the query clusters have been identified, the common usage patterns of search engine users can be discovered with statistical tools (Huang, Yao, and An 2006).

Identifying when the user has changed topics is important for designing clustering algorithms and query recommendation algorithms. In addition, identifying user sessions is very difficult in common-access computer centers, such as libraries and PC labs. When a new user accesses