Objectives: To illustrate the utility of unsupervised machine learning compared with traditional methods of analysis by identifying archetypes within the population that may be more or less likely to get the COVID-19 vaccine.
Design: A longitudinal prospective cohort study (n=2009 households) with recurring phone surveys from 2020 to 2022 to assess COVID-19 knowledge, attitudes and practices. Vaccine questions were added in the 2021 (n=1117) and 2022 (n=1121) rounds.
Setting: Five informal settlements in Nairobi, Kenya.
Participants: Individuals from 2009 households were included.
Outcome measures and analysis: Respondents were asked about COVID-19 vaccine acceptance (February 2021) and vaccine uptake (March 2022). Three distinct clusters were estimated using K-Means clustering and analysed against vaccine acceptance and vaccine uptake outcomes using regression forest analysis.
Results: Despite higher educational attainment and fewer concerns regarding the pandemic, young adults (cluster 3) were less likely to intend to get the vaccine than cluster 1 (41.5% vs 55.3%; p<0.01). Despite believing certain COVID-19 myths, older adults with larger households and more fears regarding the economic impacts of the pandemic (cluster 1) were ultimately more likely to get vaccinated than cluster 3 (78% vs 66.4%; p<0.01), potentially due to employment requirements.
Middle-aged women who were married or divorced and reported a higher risk of gender-based violence in the home (cluster 2) were more likely than young adults (cluster 3) to report wanting to get the vaccine (50.5% vs 41.5%; p=0.014) but not more likely to have received it (69.3% vs 66.4%; p=0.41), indicating potential gaps in access and a broader need for social support for this group.
Conclusions: Findings suggest this methodology can be a useful tool to characterise populations, with utility for improving targeted policy, programmes and behavioural messaging to promote uptake of healthy behaviours and ensure equitable distribution of prevention measures.
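
The clustering step named under "Outcome measures and analysis" can be illustrated with a minimal sketch. It is not the study's code or data: the respondent profiles, the two features (age, household size), the uptake probabilities and the helper names `standardize`, `kmeans` and `uptake_by_cluster` are all hypothetical, and the regression forest step is not reproduced. The sketch uses a plain standard-library version of Lloyd's algorithm; in practice a library implementation such as scikit-learn's `KMeans` would more likely be used.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids


def uptake_by_cluster(labels, outcomes, k):
    """Share of respondents with outcome == 1 in each cluster."""
    shares = {}
    for c in range(k):
        ys = [y for l, y in zip(labels, outcomes) if l == c]
        shares[c] = sum(ys) / len(ys) if ys else float("nan")
    return shares


# Synthetic respondents (assumption, NOT study data):
# (mean age, mean household size, arbitrary uptake probability)
rng = random.Random(42)
profiles = [(25, 2, 0.6), (40, 4, 0.7), (62, 7, 0.8)]
rows, vaccinated = [], []
for age, hh_size, p in profiles:
    for _ in range(100):
        rows.append([rng.gauss(age, 3), rng.gauss(hh_size, 0.7)])
        vaccinated.append(1 if rng.random() < p else 0)

labels, _ = kmeans(standardize(rows), k=3)
shares = uptake_by_cluster(labels, vaccinated, k=3)
for c, s in sorted(shares.items()):
    print(f"cluster {c}: n={labels.count(c)}, uptake={s:.1%}")
```

Cluster labels produced this way are arbitrary integers, so each cluster has to be characterised afterwards from its members' features, as the abstract does when it describes clusters 1 to 3.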