2014
DOI: 10.1609/icwsm.v8i1.14517

Big Questions for Social Media Big Data: Representativeness, Validity and Other Methodological Pitfalls

Abstract: Large-scale databases of human activity in social media have captured scientific and policy attention, producing a flood of research and discussion. This paper considers methodological and conceptual challenges for this emergent field, with special attention to the validity and representativeness of social media big data analyses. Persistent issues include the over-emphasis of a single platform, Twitter, sampling biases arising from selection by hashtags, and vague and unrepresentative sampling frames. The soc…

Cited by 343 publications (108 citation statements)
References: 23 publications
“…Social and cultural psychologists view these as three dimensions of the self that virtually all people construct to some degree, but until recently, they have examined self-construal within ethnically or nationally defined cultures (Cross, Hardin, and Gercek-Swing 2011). In the powerful online social media culture that has the potential to catalyze social and political movements (Tufekci 2014), we use self-construal theory to gain insight into how individuals define and monitor themselves within this culture. These theories offer insight into the findings described below.…”
Section: Discussion (mentioning)
confidence: 99%
“…To address the problem of representativeness and sample selection bias (Tufekci 2014), we designed a longitudinal study of Twitter profile snapshots. First, using Twitter's Streaming API, we captured 3,423,287 tweets on September 28, 2017 (approximately 1% of that day's tweets).…”
Section: Data Collection and Methods (mentioning)
confidence: 99%
“…The public debate is therefore not one-to-one the same as people's perceptions, as mapped, for example, by other analyses. First of all, the average person who is active on social media is not representative of 'the average Dutch person' (Van der Veer et al., 2018; Tufekci, 2014). Moreover, social media posts are not by definition reflections of personal perceptions.…”
Section: The Public Debate as a Starting Point (unclassified)
“…Among other things, a recent investigation by Vrij Nederland and Nieuwsuur is cited, which states that "the interplay between social and traditional media produces a thoroughly distorted picture in which developments take on a dynamic of their own and a self-reinforcing effect arises" [19]. Scientific research has long shown that reactions on social media are not necessarily representative of views held in wider society, because it is mainly a particular group of people who take part in particular debates (Tufekci, 2014). The investigation by Vrij Nederland and Nieuwsuur focuses specifically on the Dutch debate and shows that the influence of a small group of angry citizens on both the right and the left of the political spectrum is disproportionately large when we look at the nature of certain debates.…”
Section: From Undercurrent to Hype (unclassified)
“…Even if the "golden age" of API-driven computational social science and social computing research had not closed in the shadow of privacy scandals, it was nevertheless characterized by enormous inefficiencies in data collection and inequalities in access (Manovich 2011; Puschmann 2019), ethically-suspect methods and implications (boyd 2016; Tufekci 2014; Olteanu et al 2019), a lack of concern for data sharing or reproducibility (Borgman 2012; Weller and Kinder-Kurlanda 2016), and failures to validate constructs or generalize to off-platform behavior (Ekbia et al 2015; Howison, Wiggins, and Crowston 2011; Japec et al 2015). Facebook's and Twitter's changes in data access were significant, however the enclosure of previously open big social data sources is not ubiquitous among platform providers (Boyle 2017; Hess and Ostrom 2003; Hunter 2003).…”
Section: Introduction (mentioning)
confidence: 99%