2014
DOI: 10.48550/arxiv.1402.0928
Preprint

A Framework for Evaluation of Composite Memento Temporal Coherence

Abstract: Most archived HTML pages embed other web resources, such as images and stylesheets. Playback of the archived web pages typically provides only the capture date (or Memento-Datetime) of the root resource and not the Memento-Datetime of the embedded resources. In the course of our research, we have discovered that the Memento-Datetime of embedded resources can be up to several years in the future or past, relative to the Memento-Datetime of the embedding root resource. We introduce a framework for assessing temp…
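
The gap the abstract describes can be illustrated with a minimal sketch that compares the Memento-Datetime of each embedded resource against that of the root resource. The function name `embedded_deltas`, the example URIs, and the dates are hypothetical; the sketch assumes only that Memento-Datetime values use the usual HTTP date format, and it is not the framework introduced in the paper.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def embedded_deltas(root_md: str, embedded: dict[str, str]) -> dict[str, timedelta]:
    """Return, per embedded URI, its Memento-Datetime minus the root's.

    Positive values mean the embedded resource was captured after the root,
    negative values mean it was captured before the root.
    """
    root_dt = parsedate_to_datetime(root_md)
    return {
        uri: parsedate_to_datetime(md) - root_dt
        for uri, md in embedded.items()
    }


# Hypothetical example data, not taken from any archive.
deltas = embedded_deltas(
    "Sat, 01 Jan 2011 00:00:00 GMT",
    {
        "http://example.com/logo.png": "Sun, 01 Jan 2012 00:00:00 GMT",
        "http://example.com/style.css": "Fri, 31 Dec 2010 00:00:00 GMT",
    },
)
for uri, delta in deltas.items():
    print(uri, delta.days, "days relative to root")
```

A large positive or negative delta for an embedded resource is the situation the abstract refers to, where playback shows only the root's capture date.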

Cited by 6 publications (8 citation statements)
References 6 publications (17 reference statements)
“…We used the Squidwarc headless crawler [11] to load each URI-M (including executing JavaScript to ensure loading all embedded resources) and download the contents into a WARC file [16]. Saving the data in WARC files allowed us to record all HTTP response headers and content for all of the resources that made up the composite memento [1].…”
Section: Methods (mentioning)
confidence: 99%
“…Web archives are established with the objective of providing permanent access to archived web pages, or mementos. Mementos should be accessible in web archives even after the corresponding live web page is no longer available. The Uniform Resource Identifier (URI) [12] of the archived web page should not change over time, otherwise this defeats the purpose of using archived URIs.…”
Section: Introduction (mentioning)
confidence: 99%
“…A composite memento refers to all embedded resources that comprise a memento [4]. We modified the shell script (see Figure 8) written by Gwern Branwen [5].…”
Section: Generating a Hash Of A Composite Memento (mentioning)
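
One way to hash a composite memento can be sketched as follows. This is an illustrative assumption, not the shell script the citing paper refers to: the function name `composite_hash`, the choice of SHA-256, and the in-memory mapping of URIs to response bodies are all hypothetical.

```python
import hashlib


def composite_hash(resources: dict[str, bytes]) -> str:
    """Hash a root page together with its embedded resources.

    Each resource body is hashed on its own, then the (URI, digest) pairs
    are fed into an outer hash in sorted URI order, so the result does not
    depend on the order in which resources were downloaded.
    """
    outer = hashlib.sha256()
    for uri in sorted(resources):
        digest = hashlib.sha256(resources[uri]).hexdigest()
        outer.update(f"{uri} {digest}\n".encode("utf-8"))
    return outer.hexdigest()


# Hypothetical composite memento: a root page plus one embedded stylesheet.
page = {
    "http://example.com/": b"<html><link href='style.css'></html>",
    "http://example.com/style.css": b"body { color: black; }",
}
print(composite_hash(page))
```

Under this scheme, a change to any embedded resource changes the composite hash even when the root HTML is byte-for-byte identical.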
confidence: 99%
“…When the representation is replayed from the archive, the JavaScript will execute and may issue Ajax requests for a resource that is on the live web, which leads to one of two possible outcomes: the live web "leaking" into the archive leading to an incorrect representation [13], or missing embedded resources (i.e., returns a 400 or 500 class HTTP response) in the archived resource leading to an incomplete representation, both of which result in reduced archival quality [10]. When an archived deferred representation loads embedded resources from the live web via leakage, it is a zombie resource, leaving the representation incorrect, and potentially prima facie violative [2]. We refer to the ease of archiving a Web resource as archivability, and have shown that resources that rely on JavaScript to construct their representations have lower archivability than resources that avoid JavaScript [12].…”
Section: Problem Description (mentioning)
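
The two outcomes described in this statement can be sketched as a simple classification of the requests observed while replaying a memento. The function name `classify_request`, the labels, and the idea of recognising archived resources by a shared URI prefix are assumptions made for illustration; real replay systems may rewrite or proxy requests differently.

```python
def classify_request(request_url: str, status: int, archive_prefix: str) -> str:
    """Label one request made during replay of an archived page.

    "leaked"  : the request went to the live web instead of the archive.
    "missing" : the archive answered with a 400 or 500 class response.
    "ok"      : the archive served the embedded resource.
    """
    if not request_url.startswith(archive_prefix):
        return "leaked"
    if 400 <= status < 600:
        return "missing"
    return "ok"


# Hypothetical replay log: (requested URL, HTTP status) pairs.
prefix = "https://web.archive.org/web/"
log = [
    ("https://web.archive.org/web/2014/http://example.com/a.js", 200),
    ("https://web.archive.org/web/2014/http://example.com/b.png", 404),
    ("http://example.com/api/data.json", 200),
]
labels = [classify_request(url, status, prefix) for url, status in log]
print(labels)
```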
confidence: 99%