2017
DOI: 10.2139/ssrn.2844155

Enhancing Big Data in the Social Sciences with Crowdsourcing: Data Augmentation Practices, Techniques, and Opportunities

Abstract: The importance of big data is a contested topic among social scientists. Proponents claim it will fuel a research revolution, but skeptics challenge it as unreliably measured and decontextualized, with limited utility for accurately answering social science research questions. We argue that social scientists need effective tools to quantify big data's measurement error and expand the contextual information associated with it. Standard research efforts in many fields already pursue these goals through data augm…

Cited by 6 publications (4 citation statements)
References 51 publications
“…In Limitations. The classification performance reported in our study needs to be interpreted in the light of the well-known challenges of data collection via crowdsourcing, including data validity, data quality, and participant selection bias (Afshinnekoo et al, 2016; Khare et al, 2015; Porter et al, 2020). The COUGHVID database does not allow to verify the COVID-19 status of the participants, as the participants were not asked to provide a copy or confirmation of their positive or negative COVID-19 test.…”
Section: Discussion (mentioning)
confidence: 98%
“…The deception of the audit study may allow us to document discrimination but a similar scenario presented as a survey experiment may allow us to explore potential mechanisms with the right questions. Moreover, the rise of Amazon's Mechanical Turk (MTurk) makes collecting survey experiment data relatively quick and cheap (Campbell and Gaddis forthcoming; Porter, Verdery, and Gaddis 2017). In ongoing work combing an audit with a survey experiment, I find that roommate discrimination against many different racial and ethnic groups is driven by issues of cultural fit.…”
Section: Limitations Of and Ways To Improve Correspondence Audits (mentioning)
confidence: 99%
“…These tasks are often part of data-intensive processes and lengthy supply chains, feeding activities as varied as digitization of archives, market research, management of back-office operations and most importantly, development of artificial intelligence [Gray and Suri 2017]. Recent applications of machine learning - on which smart technologies, autonomous vehicles and virtual assistants are all based - rely on the creation and maintenance of large databases that need to be annotated, refined, labeled and more generally, augmented [Porter et al 2017]. Microwork is used, first, to prepare, categorize and qualify information for automatic learning algorithms; and second, to assess their performance and if necessary, to make corrections.…”
Section: Introduction (mentioning)
confidence: 99%