We study the evolution of transcription factor-binding sites in prokaryotes, using an empirically grounded model with point mutations and genetic drift. Selection acts on the site sequence via its binding affinity to the corresponding transcription factor. Calibrating the model with populations of functional binding sites, we verify this form of selection and show that typical sites are under substantial selection pressure for functionality: for cAMP response protein sites in Escherichia coli, the product of fitness difference and effective population size takes values 2NΔF of order 10. We apply this model to cross-species comparisons of binding sites in bacteria and obtain a prediction method for binding sites that uses evolutionary information in a quantitative way. At the same time, this method predicts the functional histories of orthologous sites in a phylogeny, evaluating the likelihood for conservation or loss or gain of function during evolution. We have performed, as an example, a cross-species analysis of E. coli, Salmonella typhimurium, and Yersinia pseudotuberculosis. Detailed lists of predicted sites and their functional phylogenies are available.

Regulatory interactions between genes are believed to provide an important mode of evolution, which accounts for a substantial part of the differentiation between species (1). This is reflected by the sequence variability of regulatory DNA: there is ample case evidence of compensatory evolution at conserved function but also of rapid functional changes even between closely related species (2). Lacking a quantitative model of regulatory evolution, however, alignments of regulatory sequences and predictions of their functionality have proven notoriously difficult.

A large body of existing work has focused on the identification of transcription factor-binding sites as the main functional elements of regulatory DNA.
For factors with known binding specificity (given in the form of a position weight matrix), putative binding sites are identified from their conservation in cross-species comparisons. Different measures of conservation have been introduced, which involve, e.g., the sequence similarity of aligned loci or their independent high scoring in all species compared (3-7). These methods are powerful prediction tools for binding sites. From an evolutionary point of view, however, the conservation criteria are heuristic. Hence, it is difficult to quantify the statistical significance of the results, which depends on the number and evolutionary distance of the species compared. Sequence conservation tends to be too restrictive in cases where substantial sequence variation is compatible with the position weight matrix, in particular for distant species. Simple sequence similarity measures implicitly assume neutral evolution, whereas independent scoring of orthologous sites ignores the evolutionary link between the species altogether. Most importantly, none of these conservation measures allows a consistent statistical treatment of functional innovations in the e...