2019
DOI: 10.1007/s10502-019-09325-9

Social media data archives in an API-driven world

Abstract: In this article, we explore the long-term preservation implications of application programming interfaces (APIs) which govern access to data extracted from social media platforms. We begin by introducing the preservation problems that arise when APIs are the primary way to extract data from platforms, and how tensions fit with existing models of archives and digital repository development. We then define a range of possible types of API users motivated to access social media data from platforms and consider ho…

citations
Cited by 52 publications
(40 citation statements)
references
References 30 publications
“…API-based research offers methodological comfort in providing researchers with structured data, a clear demarcation of the types of data that can be extracted, their volume and the restrictions on their use, which are specified in the platform’s terms of use (Venturini and Rogers, 2019). Although previous research proposed using APIs for archiving social media (Acker and Kreisberg, 2019; Littman et al., 2018; Lomborg, 2012), the majority of social media researchers have used API data for immediate analysis, rather than for long-term preservation, appraisal of sources or lending access to others.…”
Section: Counter-archiving As a Methods
Citation type: mentioning
confidence: 99%
“…As early as 1992, a statistical content analysis demonstrated that a statistical approach could assist in archival description of the content of early social media implementations, and especially in surfacing latent patterns in that content [24]. In their examination of using platform APIs to identify social media content for preservation, Acker and Kriesberg cautioned that this approach presents a new evidentiary and administrative challenge for archivists because it can result in decontextualizing social media data streams [1]. Most notably, the 2018 failure of the Library of Congress to be able to provide access to the entirety of its massive Twitter Archive has resulted in a number of more selective efforts to archive and provide access to Twitter content [6].…”
Section: Archiving Social Media
Citation type: mentioning
confidence: 99%
“…The aggregated data is what comprises the COVID-19 Hate Speech Twitter Archive (CHSTA), which we release publicly on Github in order to encourage its use by other researchers. Table 1 shows the number of tweets in CHSTA that were collected through each wave. Although the coverage of each wave differs by a few days, we find that wave 4 has significantly more tweets compared to that of other waves.…”
Section: The Covid-19 Hate Speech Twitter Archive (Chsta) 31 Data Sc
Citation type: mentioning
confidence: 99%
“…However, sharing "officially" and transparently not only requires effort but also certainty about the legal and ethical limitations that apply for a specific dataset. There is hence little (but growing) evidence of official sharing (for an overview see Thomson, 2016; Acker and Kreisberg, 2019) but also an unmeasured "gray market" in which data is shared informally amongst researchers within the same group or field (Weller and Kinder-Kurlanda, 2015).…”
Section: Preserving Data and Secondary Usage
Citation type: mentioning
confidence: 99%