2017
DOI: 10.48550/arxiv.1712.03140
Preprint

Difficulties of Timestamping Archived Web Pages

Abstract: We show that state-of-the-art services for creating trusted timestamps in blockchain-based networks do not adequately allow for timestamping of web pages. They accept data by value (e.g., images and text), but not by reference (e.g., URIs of web pages). Also, we discuss difficulties in repeatedly generating the same cryptographic hash value of an archived web page. We then introduce several requirements to be fulfilled in order to produce repeatable hash values for archived web pages.
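The repeatability problem the abstract describes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two replayed documents, the `<!-- replayed ... -->` banner format, and the `strip_replay_banner` helper are hypothetical and are not taken from the paper or from any real web archive. The sketch only shows that content injected at replay time changes the hash, and that hashing a normalized form can make the value repeatable.

```python
import hashlib
import re


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical archived page body (assumption, not real archive output).
BODY = "<html><body><p>Archived article text</p></body></html>"

# Two hypothetical replays of the same capture. Only the banner comment,
# injected at replay time, differs between them.
replay_a = "<!-- replayed 2017-12-01T10:00:00Z -->\n" + BODY
replay_b = "<!-- replayed 2017-12-02T08:30:00Z -->\n" + BODY


def strip_replay_banner(html: str) -> str:
    """Remove the hypothetical replay banner before hashing."""
    return re.sub(r"<!-- replayed .*? -->\n", "", html)


# Hashing the raw replays gives different values for the same capture.
raw_equal = sha256_hex(replay_a.encode("utf-8")) == sha256_hex(replay_b.encode("utf-8"))

# Hashing the normalized content gives the same value both times.
normalized_equal = (
    sha256_hex(strip_replay_banner(replay_a).encode("utf-8"))
    == sha256_hex(strip_replay_banner(replay_b).encode("utf-8"))
)

print("raw hashes equal:", raw_equal)
print("normalized hashes equal:", normalized_equal)
```

Real archived pages vary in more ways than a single banner, so a practical normalization step would need to be considerably more careful than this one.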

Citations: cited by 4 publications (6 citation statements)
References: 12 publications
“…We also found that some URI-Ms redirected to mementos with memento-datetimes after 2016. Grusky et al did not encounter these problems in 2016 likely due to issues relating to web archive playback, as studied by Ainsworth et al [38] and Aturban et al [39], which we worked around by discarding mementos that redirected beyond 2016. After resolving these issues, shown in Table III, we were left with 277,724 mementos of news articles to evaluate.…”
Section: Methods (mentioning)
confidence: 98%
“…Taking into account all of these archive-related issues, it becomes a challenging problem to distinguish between legitimate changes by archives and malicious changes. In our technical report [9] we provide several recommendations of how to generate repeatable fixity information. Kuhn et al [32] define a trusty URI as a URI that contains a cryptographic hash value of the content it identifies as shown in Figure 2.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…(8) Name the file using its content-addressable hash. (9) Compress the block file to efficiently archive it. (10) Publish the compressed block file on a URI that contains its hash.…”
Section: Block Dissemination (mentioning)
confidence: 99%
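Steps (8) to (10) quoted above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the citing authors' implementation: SHA-256 as the hash, gzip as the compression format, hashing the uncompressed bytes, the `.gz` naming scheme, the `package_block` helper, and the `https://example.org/blocks/` base URI are all hypothetical choices that the excerpt does not specify.

```python
import gzip
import hashlib
from typing import Tuple


def package_block(block: bytes, base_uri: str) -> Tuple[str, bytes, str]:
    """Name a block by its content hash, compress it, and build its URI.

    Assumptions (not from the source): SHA-256 over the uncompressed
    bytes, gzip compression, and a "<hash>.gz" file name.
    """
    # Step (8): derive the file name from the content hash.
    digest = hashlib.sha256(block).hexdigest()
    filename = digest + ".gz"

    # Step (9): compress the block. mtime=0 keeps the gzip header
    # constant, so repeated runs yield byte-identical output.
    compressed = gzip.compress(block, mtime=0)

    # Step (10): build a publication URI that contains the hash.
    uri = base_uri.rstrip("/") + "/" + filename
    return filename, compressed, uri


# Usage with a hypothetical block and base URI.
filename, compressed, uri = package_block(
    b"example block contents", "https://example.org/blocks/"
)
print(filename)
print(uri)
```

Because the name is derived from the content, anyone who retrieves the file can decompress it, recompute the hash, and compare it with the name in the URI.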
“…We describe the steps we took to create a data set of 16,627 mementos of 3,698 unique live web URIs (Uniform Resource Identifiers) from 17 public web archives. We use this collection in our study of identifying changes and transformations in the content of mementos over time (our preliminary work can be found in [6,7,8]).…”
Section: Introduction (mentioning)
confidence: 99%