Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval 2011
DOI: 10.1145/2009916.2009991

Temporal index sharding for space-time efficiency in archive search

Abstract: Time-travel queries that couple temporal constraints with keyword queries are useful in searching large-scale archives of time-evolving content such as web archives or wikis. Typical approaches for efficient evaluation of these queries involve slicing either the entire collection [20] or individual index lists [10] along the time axis. Neither method is satisfactory, since both sacrifice index compactness for processing efficiency, making the index either too big or, otherwise, too slow. We present a n…

Citations: cited by 16 publications (21 citation statements)
References: 18 publications

Citation statements
“…The approach proposed in [6] is to shard (or horizontally partition) each term list along the document identifiers instead of time. Entries in a term list are thus distributed over disjoint sub-lists called shards, and entries in a shard are ordered according to their start times t_s.…”
Section: Previous Methods; citation type: mentioning
confidence: 99%
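
The shard layout described in this statement can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the method of [6]: the `Posting` fields, the `assign` function that picks a shard for each entry, and the inclusive end time are all hypothetical. The only property taken from the statement is that shards are disjoint sub-lists of one term list and that each shard is ordered by start time, which lets a scan skip every entry that starts after the query interval ends.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    doc_id: int
    start: int  # start time t_s of the document version
    end: int    # end of validity (inclusive; an assumption of this sketch)


def build_shards(postings, assign):
    """Distribute one term list over disjoint shards, each sorted by start time.

    `assign` is a hypothetical placeholder that maps a posting to a shard key;
    it stands in for whatever partitioning rule the index actually uses.
    """
    shards = {}
    for p in postings:
        shards.setdefault(assign(p), []).append(p)
    for entries in shards.values():
        entries.sort(key=lambda p: p.start)
    return shards


def scan_shard(entries, q_start, q_end):
    """Return postings in one shard whose validity overlaps [q_start, q_end]."""
    starts = [p.start for p in entries]
    # Entries are sorted by start time, so everything from `cutoff` onward
    # begins after the query interval ends and cannot match.
    cutoff = bisect_right(starts, q_end)
    return [p for p in entries[:cutoff] if p.end >= q_start]


term_list = [Posting(1, 5, 9), Posting(2, 1, 3), Posting(3, 2, 8), Posting(4, 7, 12)]
# Purely illustrative assignment: split by document identifier parity.
shards = build_shards(term_list, lambda p: p.doc_id % 2)
matches = [p for entries in shards.values() for p in scan_shard(entries, 4, 6)]
```

The binary search only bounds the scan from the right. Entries that start early but also end before the query interval still have to be read and filtered, which is the kind of wasted work a good partitioning tries to limit.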
“…An optimal greedy algorithm for creating this partitioning is given in [6]; an example of temporal sharding for the term list from Fig. 1 is shown in Fig.…”
Section: Previous Methods; citation type: mentioning
confidence: 99%
“…This important subfield of IR has the goal to improve search effectiveness by exploiting temporal information in documents and queries [11,12]. The temporal dimension leads to new challenges in query understanding [13], retrieval models [14,15] as well as temporal indexing [16,17]. However, most temporal indexing approaches treat documents as static texts with a certain validity, which does not account for the dynamics in Web archives as described above.…”
Section: Web Archive Search; citation type: mentioning
confidence: 99%
“…Users can thus combine a keyword query (e.g., financial crisis) with a time interval (e.g., [2006,2008]) to retrieve all document versions from the archive that are considered relevant to the given keywords and existed during the given time interval. Different index structures [3,4,7] have been proposed to efficiently support time-travel text search, incurring controllable overhead either in terms of index size or response time when compared to standard text search.…”
Section: Introduction; citation type: mentioning
confidence: 99%
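
The query semantics in this last statement (keywords plus a time interval, returning versions that existed during that interval) can be sketched as a brute-force filter. This is a hedged illustration only: the `Version` record, its year-granularity validity fields, and the use of a simple all-keywords-present check in place of a real relevance model are assumptions made for the example, and no index structure from [3,4,7] is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    doc_id: str
    text: str
    valid_from: int  # first year in which this version existed (assumed granularity)
    valid_to: int    # last year in which this version existed (inclusive)


def time_travel_search(versions, keywords, q_from, q_to):
    """Return versions that contain all keywords and existed during [q_from, q_to]."""
    terms = [k.lower() for k in keywords]
    hits = []
    for v in versions:
        words = set(v.text.lower().split())
        # Two closed intervals overlap when each starts before the other ends.
        overlaps = v.valid_from <= q_to and v.valid_to >= q_from
        if overlaps and all(t in words for t in terms):
            hits.append(v)
    return hits


archive = [
    Version("a", "financial crisis hits banks", 2007, 2009),
    Version("a", "banks recover", 2010, 2012),
    Version("b", "financial crisis of 1929", 1999, 2005),
    Version("c", "crisis in financial markets", 2008, 2008),
]
results = time_travel_search(archive, ["financial", "crisis"], 2006, 2008)
```

Scanning every version costs time linear in the archive size. The index structures the statement refers to exist to avoid this scan, at a controllable cost in index size or response time.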