Proceedings of the 29th on Hypertext and Social Media 2018
DOI: 10.1145/3209542.3209560

Bootstrapping Web Archive Collections from Social Media

Cited by 12 publications (6 citation statements)
References 18 publications
“…How similar are social media collections about events to web archive collections on the same topic? Nwala et al (Nwala et al, 2018a) analysed the text from web pages shared on Twitter, Storify, Reddit and the web archiving platform Archive-It. Their results show that web archive collections about events are very similar to social media collections.…”
Section: Analysis Of Social Media (mentioning)
confidence: 99%
“…As a tweet can have at most 280 characters, this platform poses several challenges caused by the short content of messages. For example, there are studies extending traditional IR/NLP techniques designed for long documents such as news articles to fit short texts, e.g., identifying central topic model from tweet streams [48], summarizing tweets [22], retrieving opinions [21], detecting community [8], and building corpora [38,42,46]. In addition, Twitter contains not only texts but also unique features such as hashtags, followers and followees (i.e., Twitter users who follow or are followed by a particular user), and URLs.…”
Section: Twitter Analysis (mentioning)
confidence: 99%
“…The problem of knowing what to collect from the web has also been treated in the digital library research community as a focused crawling problem. In focused crawling the goal is to collect content about particular topics (Risse et al, 2012), events (Klein, Balakireva, & Van de Sompel, 2018; Yang, Chitturi, Wilson, Magdy, & Fox, 2012), or to collect content that has a particular characteristic such as popularity (Page, Brin, Motwani, & Winograd, 1999), importance (Baeza-Yates, Marin, Castillo, & Rodriguez, 2005) or social engagement (Gossen, Demidova, & Risse, 2015; Milligan, Ruest, & Lin, 2016; Nwala, Weigle, & Nelson, 2018). Generally speaking these approaches take the focus to be a topic, event, person, organization that can be qualified by the types of media (documents, audio, video).…”
Section: Digital Libraries (mentioning)
confidence: 99%