Online data collection has become indispensable to the social sciences, polling, marketing, and corporate research. In recent years, however, online studies have been inundated with low-quality data, which threatens the validity of online research and, at times, invalidates entire studies. Random, inconsistent, and fraudulent responses in online surveys are often assumed to come from ‘bots,’ but little is known about whether bad data are produced by bots or by ill-intentioned or inattentive humans. We examined this issue on Mechanical Turk (MTurk), a popular online data collection platform. In the summer of 2018, researchers noticed a sharp increase in data quality problems on MTurk, problems that were commonly attributed to bots. Despite this assumption, few studies have directly examined whether problematic data on MTurk come from bots or inattentive humans, even though identifying the source of bad data has important implications for designing the right solutions. Using CloudResearch’s data quality tools to identify problematic participants in 2018 and 2020, we provide evidence that many of the data quality problems on MTurk can be tied to fraudulent users from outside the U.S. who pose as American workers. Hence, our evidence strongly suggests that the source of low-quality data is real humans, not bots. We additionally present evidence that these fraudulent users are behind data quality problems on other platforms.
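The abstract does not disclose how CloudResearch's tools work, but the core finding (fraudulent non-U.S. users posing as American workers) suggests simple screening heuristics. Below is a minimal sketch, not CloudResearch's actual tooling, of two such checks: flagging respondents whose self-reported country conflicts with their IP geolocation, and flagging IP addresses shared by an implausible number of "different" workers. All file and column names here are illustrative assumptions.

```python
import pandas as pd

# Hypothetical response log with columns:
# 'worker_id', 'self_reported_country', 'ip_country', 'ip_address'
df = pd.read_csv("responses.csv")

# Heuristic 1: respondent claims to be in the U.S., but the IP geolocates elsewhere.
df["country_mismatch"] = (
    (df["self_reported_country"] == "US") & (df["ip_country"] != "US")
)

# Heuristic 2: one IP address submitting as many distinct workers,
# a common signature of server farms routing traffic through a shared endpoint.
ip_counts = df["ip_address"].value_counts()
df["shared_ip"] = df["ip_address"].map(ip_counts) > 3  # threshold is arbitrary

suspicious = df[df["country_mismatch"] | df["shared_ip"]]
print(f"{len(suspicious)} of {len(df)} responses flagged for manual review")
```

Flags like these are screening aids, not verdicts; in practice they would be combined with attention checks and manual review before excluding any participant.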
Thousands of readily downloadable county-level data sets offer untapped potential for linking geo-social influences to individual-level human behavior. In this study, we describe a methodology for county-level sampling of online participants, allowing us to link the self-reported behavior of N = 1084 online respondents to contemporaneous county-level data on COVID-19 infection rate density. Using this approach, we show that infection rate density predicts person-level self-reported face mask wearing beyond multiple other demographic and attitudinal covariates. Using the present effort as a demonstration project, we describe the underlying sampling methodology and discuss a wider range of potential applications. A sketch of the kind of county-level linkage involved appears below.
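The following is a minimal sketch (not the authors' code) of the linkage the abstract describes: joining person-level survey responses to county-level infection rates via county FIPS codes, then testing whether the county rate predicts mask wearing beyond individual covariates. The file names, column names, and covariate list are illustrative assumptions.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical inputs: one row per respondent, one row per county.
respondents = pd.read_csv("survey_responses.csv")     # includes a 'fips' county code
county_rates = pd.read_csv("county_covid_rates.csv")  # 'fips', 'infection_rate'

# Link each respondent to the contemporaneous infection rate in their county.
linked = respondents.merge(county_rates, on="fips", how="left")

# Does county-level infection rate predict self-reported mask wearing
# beyond person-level covariates? (Covariate names are placeholders.)
model = smf.logit(
    "mask_wearing ~ infection_rate + age + C(gender) + C(party_id)",
    data=linked,
).fit()
print(model.summary())
```

The same merge-on-FIPS pattern generalizes to any of the county-level data sets the abstract mentions, which is what makes the sampling methodology reusable beyond this COVID-19 demonstration.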
Despite the absence of many traditional barriers to gender equality, there continues to be a gender pay gap in new job economies (i.e., the “gig economy” or “platform work”). Taking a novel approach to the study of the gender pay gap, we use a completely gender-blind online work setting to examine the effect of a covert source of gender inequality: differential pay expectations. Our findings reveal that women’s lower pay expectations lead to lower earnings. Crucially, these differential pay expectations appear to be shaped by income disparities in the traditional job economy. This research provides important new insight into the endurance of the gender pay gap, suggesting that structural inequities can carry over to new economies in subtle, yet powerful ways.