Background
Smartphones are increasingly used in health research. They provide a continuous connection between participants and researchers, enabling the long-term health trajectories of large populations to be monitored at a fraction of the cost of traditional research studies. However, despite this potential, effective strategies are urgently needed to reach, recruit, and retain target populations in remote research in a representative and equitable manner.
Objective
We aimed to investigate the impact of combining different recruitment and incentive distribution approaches on cohort characteristics and long-term retention in remote research. We also evaluated real-world factors that significantly affected active and passive data collection.
Methods
We conducted a secondary analysis of participant recruitment and retention data from a large remote observational study aimed at understanding real-world factors linked to colds and influenza and the impact of traumatic brain injury on daily functioning. Recruitment was conducted in 2 phases between March 15, 2020, and January 4, 2022. More than 10,000 smartphone owners in the United States were recruited to provide 12 weeks of daily surveys and smartphone-based passive-sensing data. Using multivariate statistics, we investigated the potential impact of different recruitment and incentive distribution approaches on cohort characteristics. Survival analysis was used to assess the effects of sociodemographic characteristics on participant retention across the 2 recruitment phases. Associations between passive data-sharing patterns and demographic characteristics of the cohort were evaluated using logistic regression.
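To make the analytic approach concrete, the sketch below shows, in Python, how a survival analysis of retention (Cox proportional hazards) and a logistic regression of passive data sharing of the kind described above might be fit. It is illustrative only: the variable names (days_retained, dropped_out, recruitment_source, device_type, shared_passive_data), the synthetic data, and the model specifications are assumptions, not the study's actual data or code.

```python
# Illustrative sketch only: mirrors the types of analyses described in the Methods
# (survival analysis of retention, logistic regression of passive data sharing).
# All column names and the synthetic data are hypothetical placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 500

# Hypothetical participant-level table (synthetic placeholder data).
df = pd.DataFrame({
    "days_retained": rng.integers(1, 85, n),       # days active in the 12-week study
    "dropped_out": rng.integers(0, 2, n),          # 1 = left before week 12
    "recruitment_source": rng.choice(["social_media", "crowdsourcing"], n),
    "device_type": rng.choice(["android", "ios"], n),
    "age": rng.integers(18, 80, n),
    "shared_passive_data": rng.integers(0, 2, n),  # 1 = shared passive sensor streams
})

# Survival analysis of retention: Cox proportional hazards model
# with dummy-encoded categorical covariates.
cox_df = pd.get_dummies(
    df[["days_retained", "dropped_out", "recruitment_source", "device_type", "age"]],
    columns=["recruitment_source", "device_type"],
    drop_first=True,
    dtype=float,
)
cph = CoxPHFitter()
cph.fit(cox_df, duration_col="days_retained", event_col="dropped_out")
cph.print_summary()

# Logistic regression of passive data sharing; exponentiated coefficients
# give odds ratios with 95% CIs, analogous to those reported in the Results.
logit = smf.logit(
    "shared_passive_data ~ C(recruitment_source) + C(device_type) + age", data=df
).fit()
print(np.exp(logit.params))      # odds ratios
print(np.exp(logit.conf_int()))  # 95% CIs on the odds ratio scale
```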
Results
We analyzed over 330,000 days of engagement data collected from 10,000 participants. Our key findings are as follows: first, the overall characteristics of participants recruited using digital advertisements on social media and news media differed significantly from those of participants recruited using crowdsourcing platforms (Prolific and Amazon Mechanical Turk; P<.001). Second, participant retention varied significantly across study phases, recruitment sources, and socioeconomic and demographic factors (P<.001). Third, notable differences in passive data collection were associated with device type (Android vs iOS) and participants’ sociodemographic characteristics. Black or African American participants were significantly less likely to share passive sensor data streams than non-Hispanic White participants (odds ratio 0.44-0.49, 95% CI 0.35-0.61; P<.001). Fourth, participants were more likely to complete baseline surveys if the surveys were administered immediately after enrollment. Fifth, technical glitches significantly disrupted real-world data collection in remote settings, which can severely compromise the generation of reliable evidence.
Conclusions
Our findings highlight several factors, including recruitment platforms, incentive distribution frequency, the timing of baseline surveys, device heterogeneity, and technical glitches in data collection infrastructure, that can affect long-term remote data collection. Taken together, these empirical findings could help inform best practices for monitoring anomalies during real-world data collection and for recruiting and retaining target populations in a representative and equitable manner.