Minimum spanning trees (MSTs) are frequently used in molecular epidemiology research to estimate relationships among individual strains or isolates. Nevertheless, there are significant caveats to MST algorithms that have been largely ignored in molecular epidemiology studies and that have the potential to confound or alter the interpretation of the results of those analyses. Specifically, (i) presenting a single, arbitrarily selected MST illustrates only one of potentially many equally optimal solutions, and (ii) statistical metrics are not used to assess the credibility of MST estimations. Here, we survey published MSTs previously used to infer microbial population structure in order to determine the effect of these factors. We propose a technique to estimate the number of alternative MSTs for a data set and find that multiple MSTs exist for each case in our survey. By implementing a bootstrapping metric to evaluate the reliability of alternative MST solutions, we discover that they encompass a wide range of credibility values. On the basis of these observations, we conclude that current approaches to studying population structure using MSTs are inadequate. We instead propose a systematic approach to MST estimation that bases analyses on the optimal computation of an input distance matrix, provides information about the number and configurations of alternative MSTs, and allows identification of the most credible MST or MSTs by using a bootstrapping metric. We hope that this algorithm will become the new "gold standard" approach for analyzing MSTs for molecular epidemiology so that this generally useful computational approach can be used informatively and to its full potential.

Although a classic problem of academic mathematics (10), minimum spanning trees (MSTs) have become an increasingly common tool for molecular epidemiology research.
Given a set of pairwise distances that describe the degree of dissimilarity among individuals, an MST is a set of edges (connections) that links all nodes (individuals) together with the smallest possible total edge length. In molecular epidemiology, this path is interpreted as the most likely chain of pathogen transmission. Because MSTs are calculated from simple arithmetic distance matrices, they are particularly useful for examining relationships among organisms over short time scales, such as disease outbreaks or the short-range transmission of pathogens within communities, where too little genetic diversity has accrued to permit the use of more mathematically sophisticated algorithms for inferring population structure, such as phylogenetic analysis (20) or model-based clustering algorithms (7).

Despite their popularity, there are serious problems in applying MSTs to molecular epidemiology that are almost invariably overlooked. (i) Although virtually all algorithms report a single MST, there are frequently multiple, equally optimal solutions to the MST problem. In any data set, several equally parsimonious paths can exist if two or more edges have the same length. In ...
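
To make the point about equally optimal solutions concrete, the following minimal Python sketch enumerates every MST of a small, made-up distance matrix. It is an illustration only and not the technique proposed in this work: the matrix `DIST`, the helper names, and the brute-force search over all sets of n - 1 edges are assumptions chosen for clarity, and the exhaustive search is practical only for a handful of isolates.

```python
from itertools import combinations

# Hypothetical pairwise distances among four isolates (A, B, C, D).
# The three distances among A, B and C are tied at 1.
DIST = [
    [0, 1, 1, 3],
    [1, 0, 1, 3],
    [1, 1, 0, 2],
    [3, 3, 2, 0],
]


def is_spanning_tree(n, edges):
    """Return True if the n - 1 given edges connect n nodes without a cycle."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _ in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    # n - 1 edges with no cycle necessarily connect all n nodes.
    return True


def all_minimum_spanning_trees(dist):
    """Return (minimum total length, list of all spanning trees achieving it)."""
    n = len(dist)
    edges = [(i, j, dist[i][j]) for i in range(n) for j in range(i + 1, n)]
    best, trees = None, []
    for subset in combinations(edges, n - 1):
        if not is_spanning_tree(n, subset):
            continue
        total = sum(weight for _, _, weight in subset)
        if best is None or total < best:
            best, trees = total, [subset]
        elif total == best:
            trees.append(subset)
    return best, trees


best, trees = all_minimum_spanning_trees(DIST)
print(best, len(trees))  # 4 3
```

In this toy example, three different trees share the minimum total length of 4, because any two of the three tied edges among A, B and C can be combined with the C-D edge. An algorithm that reports a single MST would return only one of the three, with the choice typically depending on the order in which tied edges are processed.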