Proceedings of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries 2011
DOI: 10.1145/1998076.1998100

How much of the web is archived?

Abstract: Although the Internet Archive's Wayback Machine is the largest and most well-known web archive, there have been a number of public web archives that have emerged in the last several years. With varying resources, audiences and collection development policies, these archives have varying levels of overlap with each other. While individual archives can be measured in terms of number of URIs, number of copies per URI, and intersection with other archives, to date there has been no answer to the question "How much…

Cited by 81 publications (88 citation statements)
References: 18 publications
“…First, researchers using web archive data have a subset of the full web. Using Ainsworth et al's (2013) estimates of web pages they might have between 35% and 90% of the web. By constructing their sample of URLs from DMOZ, Delicious, Bitly, and Google, Ainsworth et al (2013) almost certainly examined the inclusion of more popular and prominent URLs (i.e.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Using Ainsworth et al's (2013) estimates of web pages they might have between 35% and 90% of the web. By constructing their sample of URLs from DMOZ, Delicious, Bitly, and Google, Ainsworth et al (2013) almost certainly examined the inclusion of more popular and prominent URLs (i.e. the URLs included in DMOZ or added to Delicious are by definition more popular and prominent than the URLs that no one adds to these platforms).…”
Section: Discussion
Citation type: mentioning
confidence: 99%