2019
DOI: 10.1177/0049124119882481

Disambiguating and Specifying Social Actors in Big Data: Using Wikipedia as a Data Source for Demographic Information

Abstract: Despite the recent and ongoing progress in using text-mining tools to automatically analyze large text corpora, there remains significant potential to facilitate the study of social action in social science research. In this context, particularly the disambiguation (who is referred to in a text?) and specification (which demographic characteristics are present?) of social actors—currently a manual job—remains a challenge. This article demonstrates a reliable and accurate software architecture for social scient…

citations
Cited by 6 publications
(4 citation statements)
references
References 65 publications
“…Our findings are in line with studies showing that political information on Wikipedia is often factually correct (Brown 2011; Göbel and Munzert 2021; Poschmann and Goldenstein 2019), and that Wikipedia can be used for extending political science measurement to a much larger universe of cases (Munzert 2018). Our results also tie in with studies demonstrating the validity of crowdsourcing information in political research (e.g., Sumner, Farris, and Holman 2020; Winter, Hughes, and Sanders 2020).…”
Section: Introduction (supporting)
confidence: 91%
“…Together, our results indicate that Wikipedia classifications allow for extracting left-right scores comparable to scores obtained via conventional expert coding methods. These findings are in line with studies showing that political information on Wikipedia is often factually correct (Brown 2011, Göbel & Munzert 2019, Poschmann & Goldenstein 2019), and that Wikipedia can be used for extending political science measurement to a much larger universe of cases (Munzert 2018).…”
Section: Introduction (supporting)
confidence: 88%
“…While extensive research has been conducted on demographics of large geo-based communities (Chambers 2020; Brass 1996) and topic-specific study cases (Poschmann and Goldenstein 2022; Sun and Peng 2021; Zhou et al 2020), to the best of our knowledge, this is the first work to address communities of interest, releasing data which covers 16 topics from 4 main domains, namely culture, geography, history & society, and STEM.…”
Section: Community (mentioning)
confidence: 99%