Genetic surveillance of malaria parasites supports malaria control programmes, treatment guidelines and elimination strategies. Surveillance studies often pose questions about malaria parasite ancestry (e.g. about the spread of antimalarial resistance) and employ methods that characterise parasite population structure. Many of the methods used to characterise structure are algorithms developed in machine learning (ML) and depend on a genetic distance matrix, e.g. principal coordinates analysis (PCoA) and hierarchical agglomerative clustering (HAC). However, PCoA and HAC are sensitive to both the definition of genetic distance and algorithmic specification. Importantly, neither algorithm generates an inferred malaria parasite ancestry. As such, PCoA and HAC can support (e.g. via exploratory visualisation and hypothesis generation), but not answer comprehensively, key questions about malaria parasite ancestry. We illustrate the sensitivity of PCoA and HAC using 393 P. falciparum whole genome sequences collected from Cambodia and neighbouring regions (where antimalarial resistance has emerged and spread recently) and we provide tentative guidance for the use of PCoA and HAC in malaria parasite genetic epidemiology. This guidance includes a call for fully transparent and reproducible analysis pipelines that feature (i) a clearly outlined scientific question; (ii) clear justification of methods used along with discussion of any inferential limitations; (iii) publicly available genetic distance matrices; and (iv) sensitivity analyses. To bridge the inferential disconnect between the output of the ML algorithms and the scientific questions of interest, tailor-made statistical models are needed to infer malaria parasite ancestry. In the absence of such models, speculative reasoning should feature only as discussion but not as results.
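The sensitivity of HAC to algorithmic specification can be seen on a toy example. The sketch below is illustrative only and is not the analysis pipeline used in this study: the 5 x 5 distance matrix `D` contains made-up values (not real genetic distances), and the function `hac` is a hypothetical, deliberately naive implementation written for clarity. It shows that the same distance matrix, cut at two clusters, yields different partitions under single and complete linkage.

```python
from itertools import combinations

# Toy symmetric distance matrix for five hypothetical samples (indices 0..4).
# The values are invented for illustration; they are NOT real genetic distances.
D = [
    [0.0, 1.0, 2.1, 3.3, 5.0],
    [1.0, 0.0, 1.1, 2.3, 4.0],
    [2.1, 1.1, 0.0, 1.2, 2.9],
    [3.3, 2.3, 1.2, 0.0, 1.7],
    [5.0, 4.0, 2.9, 1.7, 0.0],
]


def hac(dist, n_clusters, linkage="single"):
    """Naive agglomerative clustering on a square distance matrix.

    linkage="single"   merges the two clusters with the smallest minimum
                       pairwise distance between their members;
    linkage="complete" merges the two clusters with the smallest maximum
                       pairwise distance between their members.
    Returns the clusters as sorted lists of sample indices.
    """
    aggregate = {"single": min, "complete": max}[linkage]
    clusters = [frozenset([i]) for i in range(len(dist))]

    def between(pair):
        # Distance between two clusters under the chosen linkage.
        a, b = pair
        return aggregate(dist[i][j] for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=between)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]

    return sorted(sorted(c) for c in clusters)


single = hac(D, 2, linkage="single")
complete = hac(D, 2, linkage="complete")
print("single linkage:  ", single)    # [[0, 1, 2, 3], [4]]
print("complete linkage:", complete)  # [[0, 1], [2, 3, 4]]
```

Here single linkage chains samples 0 to 3 together and isolates sample 4, whereas complete linkage groups samples 2 to 4. Neither partition is an inferred ancestry; the difference arises purely from the linkage rule, which is why a sensitivity analysis over such choices is worth reporting.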
... molecular surveillance to understand how the parasites causing malaria are spreading, to track drug resistance and identify its emergence, and to understand where to target interventions. Malaria parasite genetic data have been accrued increasingly rapidly in recent years, but the analytical methods for making sense of them lag behind. In particular, the methods used currently to infer malaria parasite ancestry, which is often central to the questions posed by studies of malaria parasite genetic epidemiology, have significant limitations. An important goal in malaria parasite genetic epidemiology is inference of the full ancestral recombination graph. However, there are no scalable methods directly applicable to malaria parasites. Instead, the first step towards inferring ancestry involves characterisation of contemporary population genetic structure using computationally tractable methods. Many questions of clinical and public health relevance, for example, interpreting reduced haplotype diversity as a selective sweep, rely on methods that first characterise the underlying population structure w...