2020
DOI: 10.2196/23021

Threats of Bots and Other Bad Actors to Data Quality Following Research Participant Recruitment Through Social Media: Cross-Sectional Questionnaire

Abstract: Background: Recruitment of health research participants through social media is becoming more common. In the United States, 80% of adults use at least one social media platform. Social media platforms may allow researchers to reach potential participants efficiently. However, online research methods may be associated with unique threats to sample validity and data integrity. Limited research has described issues of data quality and authenticity associated with the recruitment of health research part…

Citations: cited by 183 publications (174 citation statements)
References: 29 publications
“…To ensure validity and integrity of data and reduce potential fraudulent responses encountered in social media recruitment, participants were first asked questions regarding eligibility to eliminate automated software or "bots." 12 Additional steps to reduce fraudulent responses included prohibition of duplicate emails; removal of respondents whose survey completion was <5 minutes given average completion time of 17 minutes; and removal of respondents reporting "highly improbable" medical treatment patterns (e.g., patients who reported stage 1 colon cancer and reported receiving immunotherapy) as reviewed by a medical oncologist (A.B.).…”
Section: Methods (mentioning)
confidence: 99%
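
The automatable screening rules in the statement above (duplicate emails and a minimum completion time) can be sketched roughly as follows. This is only an illustration, not the citing authors' code: the `Response` record, its field names, and the `screen_responses` helper are hypothetical, and keeping the first response per email is an assumption, since the statement says only that duplicate emails were prohibited. The clinical plausibility review by an oncologist is a manual step and is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical record layout; real survey exports will differ.
    email: str
    minutes_to_complete: float


def screen_responses(responses, min_minutes=5.0):
    """Drop repeat emails and responses completed in under min_minutes.

    Assumption: the first response seen for an email claims that email,
    even if that first response is later dropped for being too fast.
    """
    seen_emails = set()
    kept = []
    for response in responses:
        # Normalize so "A@x.org " and "a@x.org" count as the same address.
        key = response.email.strip().lower()
        if key in seen_emails:
            continue
        seen_emails.add(key)
        if response.minutes_to_complete < min_minutes:
            continue
        kept.append(response)
    return kept


sample = [
    Response("a@x.org", 17),
    Response("A@x.org ", 16),  # duplicate email after normalization
    Response("b@x.org", 3),    # faster than the 5-minute threshold
    Response("c@x.org", 5),
]
kept = screen_responses(sample)
```

The 5-minute default mirrors the threshold quoted above; any real study would need to choose its own cutoff relative to its typical completion time.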
“…To check for understanding and refine the survey, the survey was piloted with 20 adults prior to data collection. Various strategies were used with the online survey to ensure data quality (e.g., inclusion of open-ended items, raffle incentive vs remuneration for all, embedded directive items; Pozzar et al, 2020). Sample inclusion required the participant to have at least one child.…”
Section: Methods (mentioning)
confidence: 99%
“…In prior work using similar recruitment methods, (7,15) identification of bots or other sources of invalid records is of particular concern and several methods for such identification have been utilized. We importantly identified that even with survey platform tools (such as the 'Prevent Ballot Box Stuffing' option), additional attention to records' metadata is critical to identify residual entries submitted from bots.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, clearing browser cookies, switching to a different web browser, using a different device, or using a browser in 'incognito' mode would all allow a participant to enter the survey again. As such, we additionally relied on embedded data to identify potential fraudulent entries for records attached to IP addresses that were duplicated in the data greater than four times; three of four instances were suspected to be the result of bots (fraudulent activity) (15) and discarded from the data. In the first instance, one IP address (geotagged to a location in China) contributed 172 attempted survey entries, none of which progressed in the survey beyond the consent page, that were all submitted within a 24-minute window.…”
Section: Assessing Data Validity (mentioning)
confidence: 99%
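
The IP-based check described in the statement above could look something like the sketch below. It is an illustration under assumptions, not the citing authors' procedure: the record layout (a dict with an `"ip"` key) and the `flag_repeated_ips` helper are hypothetical, and "duplicated greater than four times" is read here as an address appearing in more than four records. Flagged addresses would still need manual review, as the statement notes that only three of four flagged instances were judged to be bots.

```python
from collections import Counter


def flag_repeated_ips(records, max_entries=4):
    """Return {ip: count} for addresses seen in more than max_entries records."""
    counts = Counter(record["ip"] for record in records)
    return {ip: n for ip, n in counts.items() if n > max_entries}


# Example addresses come from the reserved documentation ranges.
records = [{"ip": "203.0.113.5"}] * 5 + [{"ip": "198.51.100.7"}] * 4
flagged = flag_repeated_ips(records)
```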