OrthoAlign, an algorithm for the gene order alignment problem (alignment of orthologs), accounting for most genome-wide evolutionary events such as duplications, losses, rearrangements, and substitutions, was presented. OrthoAlign was used in a phylogenetic framework to infer the evolution of transfer RNA repertoires of 50 fully sequenced bacteria in the Bacillus genus. A prevalence of gene duplications and losses over rearrangement events was observed. The average rate of duplications inferred in Bacillus was 24 times lower than the one reported in Escherichia coli, whereas the average rates of losses and inversions were both 12 times lower. These rates were extremely low, suggesting a strong selective pressure acting on tRNA gene repertoires in Bacillus. An exhaustive analysis of the type, location, distribution, and length of evolutionary events was provided, together with ancestral configurations. OrthoAlign can be downloaded at: http://www.iro.umontreal.ca/~mabrouk/.
Abstract. Recently, an alignment approach for the comparison of two genomes, based on an evolutionary model restricted to duplications and losses, has been presented. An exact linear programming algorithm has been developed and successfully applied to the transfer RNA (tRNA) repertoire in bacteria, leading to interesting observations on tRNA shift of identity. Here, we explore a direct dynamic programming approach for the Duplication-Loss Alignment of two genomes, which proceeds in two steps: (1) the dynamic programming step outputs a best candidate alignment between the two genomes, and (2) the Minimum Label Alignment step finds an evolutionary scenario of minimum duplication-loss cost that is in agreement with the alignment. We show that the Minimum Label Alignment problem is APX-hard, even if the number of occurrences of a gene inside a genome is bounded by 5. We then develop a heuristic which is thousands of times faster than the linear programming algorithm and exhibits a high degree of accuracy on simulated datasets. The heuristic has been implemented in Java and is available on request.
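To make step (1) concrete, here is a minimal Python sketch of a dynamic programming alignment of two gene orders. It is not the authors' algorithm or their Java implementation. It assumes a simplified cost model in which only identical genes may be aligned and every unaligned gene costs one unit; the function name `align_gene_orders` and the parameter `gap_cost` are hypothetical. Step (2), which labels each unaligned gene as a duplication or a loss, is not attempted.

```python
def align_gene_orders(x, y, gap_cost=1):
    """Align two gene orders, pairing only identical genes.

    Illustrative assumption: each unaligned gene costs `gap_cost`.
    Returns (total cost, list of alignment columns), where a column is
    (gene_x, gene_y) and None marks a gap.
    """
    n, m = len(x), len(y)
    inf = float("inf")

    # cost[i][j] = minimum cost of aligning x[:i] with y[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap_cost
    for j in range(1, m + 1):
        cost[0][j] = j * gap_cost

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # A match column is allowed only for identical genes.
            match = cost[i - 1][j - 1] if x[i - 1] == y[j - 1] else inf
            cost[i][j] = min(
                match,
                cost[i - 1][j] + gap_cost,  # x[i-1] left unaligned
                cost[i][j - 1] + gap_cost,  # y[j-1] left unaligned
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    columns = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and x[i - 1] == y[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            columns.append((x[i - 1], y[j - 1]))
            i -= 1
            j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + gap_cost:
            columns.append((x[i - 1], None))
            i -= 1
        else:
            columns.append((None, y[j - 1]))
            j -= 1
    columns.reverse()
    return cost[n][m], columns


total, columns = align_gene_orders(["a", "b", "c", "a"], ["a", "b", "a"])
print(total, columns)
```

In this toy input, gene `c` is the only unaligned gene. A labeling step in the spirit of step (2) would then have to decide whether such a gene is better explained as a loss in one genome or a duplication in the other.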
We relate the comparison of gene orders to an alignment problem. Our evolutionary model accounts for both rearrangement and content-modifying events. We present a heuristic based on dynamic programming for the inference of the median of three genomes and apply it in a phylogenetic framework. multiOrthoAlign is shown to be accurate on simulated and real datasets, and to run significantly faster than DupLoCut, an "almost" exact algorithm based on linear programming that was recently developed for the same problem.