1999
DOI: 10.1145/331403.331405

Analysis of a very large web search engine query log

Abstract: In this paper we present an analysis of an AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents almost 285 million user sessions, each an attempt to fill a single information need. We present an analysis of individual queries, query duplication, and query sessions. We also present results of a correlation analysis of the log entries, studying the interaction of terms within queries. Our data supports the conjecture that …

Cited by 1,018 publications
(833 citation statements)
References 1 publication
“…Studies of the Web search engine and the PubMed search log concurred with our usage-log analysis: A single term search is the most common, with three words maximum entered by typical users. 6 A PubMed study found that 22 percent of user queries were for known items rather than for a general subject, confirming our own log analysis findings that the majority of searches were for a particular source item. 7 Search-term analysis revealed that many of our users were entering partial article citations (e.g., author, date) in any query box expecting that article databases would be searched concurrently with the resource database.…”
supporting
confidence: 74%
“…These analyses have provided valuable information characterising user queries and sessions, e.g. [14,10], and give a useful overview of search behaviour on the Web.…”
Section: Related Work
mentioning
confidence: 99%
“…Co-occurrence is largely limited to phrases, such as "real estate", "red hat", "clip art", and "puerto rico". However, some words, mainly adjectives like "free", tend to co-occur with several other terms [622,374,372].…”
Section: Transactional Queries
mentioning
confidence: 99%