Background: Median construction is at the heart of several approaches to gene-order phylogeny. It has been observed that the solution to a median problem is generally not unique, and that alternate solutions may be quite different. Another concern is the tendency for medians to fall on or near one of the three input orders, and hence to contain no information about the other two. Results: We conjecture that as gene orders become more random with respect to each other, and as the number of genes increases, the breakpoint median for circular unichromosomal genomes, in both the unsigned and signed cases, tends to approach one of the input genomes (the "corners"), where closeness is measured by the distance normalized by the number of genes. Moreover, there are alternate solutions that approach each of the other inputs, so that the average distance between solutions is very large. We confirm these claims through simulations, and extend the results to medians of more than three genomes. Conclusions: This effect also introduces serious biases into the medians of less scrambled genomes. It prompts a reconsideration of the role of the median in gene-order phylogeny. Fortunately, for triples of finite-length genomes, a small proportion of the median solutions escape the tendency towards the corners, and these are relatively close to each other. This suggests that a focused search for these solutions, though they represent a decreasing minority as genome length increases, is a way out of the pathological tendency we have described.
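To make the notion of a breakpoint median concrete, the following is a minimal sketch for unsigned circular unichromosomal genomes, assuming each genome is a tuple containing the same set of at least three genes. The function names (`adjacencies`, `breakpoint_distance`, `brute_force_medians`) are hypothetical and chosen for illustration; the exhaustive search is not the authors' method and is only practical for a handful of genes. It returns every distinct optimal solution, which lets one inspect how many medians exist and how close each lies to an input genome.

```python
from itertools import permutations


def adjacencies(genome):
    """Return the set of unordered adjacencies of an unsigned circular genome."""
    n = len(genome)
    return frozenset(
        frozenset((genome[i], genome[(i + 1) % n])) for i in range(n)
    )


def breakpoint_distance(a, b):
    """Number of adjacencies of `a` that are absent from `b` (same gene set assumed)."""
    return len(a) - len(adjacencies(a) & adjacencies(b))


def brute_force_medians(genomes):
    """Exhaustively find all circular genomes minimizing the summed breakpoint
    distance to the inputs. Illustrative only: the cost grows factorially."""
    genes = sorted(genomes[0])
    first, rest = genes[0], genes[1:]
    best_score, best = None, {}
    # Fixing the first gene removes rotational duplicates of a circular order.
    for perm in permutations(rest):
        cand = (first,) + perm
        score = sum(breakpoint_distance(cand, g) for g in genomes)
        key = adjacencies(cand)  # identical for a genome and its reflection
        if best_score is None or score < best_score:
            best_score, best = score, {key: cand}
        elif score == best_score:
            best.setdefault(key, cand)
    return best_score, list(best.values())


if __name__ == "__main__":
    # Three arbitrary example orders on six genes (made up for illustration).
    inputs = [(0, 1, 2, 3, 4, 5), (0, 2, 4, 1, 5, 3), (0, 3, 1, 4, 2, 5)]
    score, medians = brute_force_medians(inputs)
    print("median score:", score, "number of distinct medians:", len(medians))
    for m in medians:
        print(m, [breakpoint_distance(m, g) for g in inputs])
```

Printing the distance from each median to each input, as in the usage lines above, is one simple way to see whether a solution sits at a "corner" (distance zero or near zero to one input) or somewhere between the inputs.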
We provide a computationally realistic mathematical framework for the NP-hard problem of the multichromosomal breakpoint median for linear genomes that can be used in constructing phylogenies. A novel approach is provided that can handle signed, unsigned, and partially signed cases of the multichromosomal breakpoint median problem. Our method provides an avenue for incorporating biological assumptions (whenever available) such as the number of chromosomes in the ancestor, and thus it can be tailored to obtain a more biologically relevant picture of the median. We demonstrate the usefulness of our method by performing an empirical study on both simulated and real data with a comparison to other methods.
We provide a computationally realistic mathematical framework for the NP-hard problem of the multichromosomal breakpoint median for linear genomes that can be used in constructing phylogenies. A novel approach is provided that can handle both signed and unsigned cases of the multichromosomal breakpoint median problem. Our method provides an avenue for incorporating biological assumptions (whenever available) such as the number of chromosomes in the ancestor, and thus it can be tailored to obtain a more biologically relevant picture of the median. We demonstrate the usefulness of our method by performing an empirical study on both simulated and real data with a comparison to other methods.