Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval 2012
DOI: 10.1145/2348283.2348318

Index maintenance for time-travel text search

Abstract: Time-travel text search enriches standard text search by temporal predicates, so that users of web archives can easily retrieve document versions that are considered relevant to a given keyword query and existed during a given time interval. Different index structures have been proposed to efficiently support time-travel text search. None of them, however, can easily be updated as the Web evolves and new document versions are added to the web archive. In this work, we describe a novel index structure that effic…

Cited by 24 publications
(13 citation statements)
References 26 publications
“…Imagine the cost of rebuilding the indexes of tens of large-scale web collections each time a new crawl ends. Alternatives for rebuilding the indexes when using the term-based partition exist, but are also more complex and less efficient than using the document-based partition [3]. The decision of partitioning the index first or after time, presents itself as a trade-off between speed and space.…”
Section: Discussion (mentioning)
confidence: 99%
“…The fast development of Information and Communication Technology had a great impact on this growth. In the last decade, the world population with access to the Internet grew more than 1 000% in some regions³. Computer-based devices and mobile phones with Internet connectivity are now about 5 billion⁴, much of which are equipped with technology that empowers people to easily create data.…”
Section: Introduction (mentioning)
confidence: 99%
“…For a compact index structure, two mappings assign ids to tags and websites, which are used exclusively in all other indexes and mappings (i.e., IdTagMapping, IdUrlMapping). Even though there is much room for improvement and optimization of temporal indexes [8,9], the rather simple mappings, which can easily be constructed using a distributed data processing platform like Hadoop, already go a long way.…”
Section: Index Structures (mentioning)
confidence: 99%
“…The focus in all of these, however, is on retrieving individual documents as opposed to analyzing a large collection of them. Indexing versioned document collections, such as web archives, has also received ample attention [7,12,22] -no existing work has looked into indexing temporal expressions contained in documents.…”
Section: Related Work (mentioning)
confidence: 99%