The population structure of Legionella pneumophila was investigated by analysing nucleotide sequences from six loci (flaA, pilE, asd, mip, mompS and proA) of 335 globally distributed isolates from clinical and environmental sources over a 29-year period (1977-2006). Data were obtained from unrelated isolates from Europe (n=270), Japan (n=31), Canada (n=7), the USA (n=24) and Australia (n=1). The country of origin of two strains was unknown. Analysis of these isolates indicated significant linkage disequilibrium between the six loci. Application of six sequence-based recombination detection tests did not reveal evidence of recombination, but estimates of rates of recombination and mutation made by a seventh test suggested that recombination could have occurred at a rate similar to, but probably lower than, that of mutation. Genealogies inferred under models with and without recombination were congruent with each other, providing no definitive evidence regarding recombination, and were in agreement with sequence clusters identified by graph methods. Further evidence supporting the distinct nature of two of the three subspecies of L. pneumophila, subsp. fraseri and subsp. pascullei, was also found. The ratios of non-synonymous to synonymous nucleotide polymorphisms for each of the allele sets were examined and revealed that the putative virulence loci mompS and pilE are under diversifying pressure, while the allelic regions of three other loci linked to virulence (flaA, proA and mip) do not appear to be.
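As an illustration of how linkage disequilibrium between loci can be quantified from allelic profiles, the sketch below computes the index of association, I_A = V_O / V_E - 1, where V_O is the observed variance of the number of loci at which pairs of isolates differ and V_E is the variance expected if alleles at different loci were independent. This is an assumption made for illustration: the text above does not state which statistic was used, and the function name and the toy profiles are hypothetical rather than taken from the study.

```python
from collections import Counter
from itertools import combinations
from statistics import pvariance


def index_of_association(profiles):
    """Return I_A = V_O / V_E - 1 for a list of allelic profiles.

    Each profile is a tuple of allele identifiers, one per locus
    (for example six entries for flaA, pilE, asd, mip, mompS, proA).
    """
    n_isolates = len(profiles)
    n_loci = len(profiles[0])

    # V_O: variance of the number of mismatching loci over all isolate pairs.
    mismatches = [
        sum(a != b for a, b in zip(p, q))
        for p, q in combinations(profiles, 2)
    ]
    v_observed = pvariance(mismatches)

    # V_E: sum over loci of h_j * (1 - h_j), where h_j is the probability
    # that two isolates drawn at random differ at locus j.
    v_expected = 0.0
    for j in range(n_loci):
        counts = Counter(p[j] for p in profiles)
        h_j = 1.0 - sum((c / n_isolates) ** 2 for c in counts.values())
        v_expected += h_j * (1.0 - h_j)

    if v_expected == 0.0:
        raise ValueError("all loci are monomorphic; I_A is undefined")
    return v_observed / v_expected - 1.0


# Toy data (hypothetical): alleles at two loci.
linked = [(1, 1), (1, 1), (2, 2), (2, 2)]        # alleles always co-occur
shuffled = [(1, 1), (1, 2), (2, 1), (2, 2)]      # every combination present
ia_linked = index_of_association(linked)
ia_shuffled = index_of_association(shuffled)
```

Values of I_A well above zero indicate that alleles at different loci are associated more strongly than expected under free recombination. In practice, significance is usually judged by comparing the observed value with values from datasets in which alleles are permuted within each locus, a step this sketch omits.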
INTRODUCTION

Following the recognition of the aetiological agent of Legionnaires' disease in 1977 (McDade et al., 1977), phenotypic and genotypic analyses of Legionella pneumophila have mainly focused on the short-term epidemiology of the bacterium. In this scenario, the aim has been to demonstrate potential environmental sources of infection, thereby allowing timely intervention, in order to prevent further infection. Many such methods have been developed and applied with the purpose of discriminating between epidemiologically unrelated strains of L. pneumophila, particularly those belonging to serogroup (sg) 1 (Edelstein et al., 1986; Fry et al., 1999; Saunders et al., 1990; Schoonmaker et al., 1992; van Ketel et al., 1984).

The utility of the multi-locus sequence typing (MLST) approach, in which sequences from five to 10 housekeeping genes are determined, in the identification of major bacterial lineages associated with invasive disease was first described in 1998 (Enright & Spratt, 1998; Maiden et al., 1998). Since then, this robust and portable technique (or variations of it) has been applied to the study of genetic diversity and clonal expansion, and to the long-term epidemiological analysis of microbial populations (see http://pubmlst.org/ and http://www.mlst.net/) (Giske et al., 2006; Paraskevopoulos et al., 2006; Vassileva et al., 2006).

Recently, a similar approach, termed sequence-based typing (SBT), developed by members of the European Working Group for Legionella Infections (EWGLI), was applied to the epidemiolo...