Objectives: This scoping review aims to clarify the definition and trajectory of citizen-led scientific research (so-called citizen science) within the healthcare domain, examine the degree of integration of machine learning (ML) and the participation levels of citizen scientists in health-related projects.
Materials and Methods: In January and September 2024 we conducted a comprehensive search in PubMed, Scopus, Web of Science, and EBSCOhost platform for peer-reviewed publications that combine citizen science and machine learning (ML) in healthcare. Articles were excluded if citizens were merely passive data providers or if only professional scientists were involved.
Results: Out of an initial 1,395 screened, 56 articles spanning from 2013 to 2024 met the inclusion criteria. The majority of research projects were conducted in the U.S. (n=20, 35.7%), followed by Germany (n=6, 10.7%), with Spain, Canada, and the UK each contributing three studies (5.4%). Data collection was the primary form of citizen scientist involvement (n=29, 51.8%), which included capturing images, sharing data online, and mailing samples. Data annotation was the next most common activity (n=15, 26.8%), followed by participation in ML model challenges (n=8, 14.3%) and decision-making contributions (n=3, 5.4%). Mosquitoes (n=10, 34.5%) and air pollution samples (n=7, 24.2%) were the main data objects collected by citizens for ML analysis. Classification tasks were the most prevalent ML method (n=30, 52.6%), with Convolutional Neural Networks being the most frequently used algorithm (n=13, 20%).
Discussion and Conclusions: Citizen science in healthcare is currently an American and European construct with growing expansion in Asia. Citizens are contributing data, and labeling data for ML methods, but only infrequently analyzing or leading studies. Projects that use “crowd-sourced” data and “citizen science” should be differentiated depending on the degree of involvement of citizens.