The Steiner problem in phylogeny is NP-complete

Foulds, L. R.; Graham, Ron

doi:10.1016/s0196-8858(82)80004-3

Cited by 364 publications

(196 citation statements)

References 10 publications


“…The number of candidate species joined in each iteration is calculated as in Foulds and Graham (1982) based on the Steiner's problem (Kumnorkaew et al 2004). The probability $P_m(i,j)$ of the ant $m$ to join the species $i$ and $j$ is calculated as:…”

Section: Selection Of the Species And Parameter Values Of The ABPR Al…

mentioning

confidence: 99%

Ant-Based Phylogenetic Reconstruction (ABPR): A new distance algorithm for phylogenetic estimation based on ant colony optimization

2008


We propose a new distance algorithm for phylogenetic estimation based on Ant Colony Optimization (ACO), named Ant-Based Phylogenetic Reconstruction (ABPR). ABPR joins two taxa iteratively based on evolutionary distance among sequences, while also accounting for the quality of the phylogenetic tree built according to the total length of the tree. Similar to optimization algorithms for phylogenetic estimation, the algorithm allows exploration of a larger set of nearly optimal solutions. We applied the algorithm to four empirical data sets of mitochondrial DNA ranging from 12 to 186 sequences, and from 898 to 16,608 base pairs, and covering taxonomic levels from populations to orders. We show that ABPR performs better than the commonly used Neighbor-Joining algorithm, except when sequences are too closely related (e.g., population-level sequences). The phylogenetic relationships recovered at and above species level by ABPR agree with conventional views. However, like other algorithms of phylogenetic estimation, the proposed algorithm failed to recover expected relationships when distances are too similar or when rates of evolution are very variable, leading to the problem of long-branch attraction. ABPR, as well as other ACO-based algorithms, is emerging as a fast and accurate alternative method of phylogenetic estimation for large data sets.


“…Keywords: Phylogenetic tree, maximum parsimony, homoplasy, tree search Phylogenetic tree reconstruction methods based on optimization criteria (such as maximum parsimony or maximum likelihood) have long been known to be computationally intractable (NP-hard) (Foulds and Graham, 1982). However, on perfectly tree-like data (i.e.…”

Section: Introduction

mentioning

confidence: 99%


Hide and seek: Placing and finding an optimal tree for thousands of homoplasy-rich sequences

Radel, Sand, Steel

2013

Molecular Phylogenetics and Evolution


Finding optimal evolutionary trees from sequence data is typically an intractable problem, and there is usually no way of knowing how close to optimal the best tree from some search truly is. The problem would seem to be particularly acute when we have many taxa and when the data have high levels of homoplasy, in which the individual characters require many changes to fit on the best tree. However, a recent mathematical result has provided a precise tool to generate a small number of high-homoplasy characters for any given tree, so that this tree is provably the optimal tree under the maximum parsimony criterion. This provides, for the first time, a rigorous way to test tree search algorithms on homoplasy-rich data, where we know in advance what the 'best' tree is. In this short note we consider just one search program (TNT) but show that it is able to locate the globally optimal tree correctly for 32,768 taxa, even though the characters in the dataset require, on average, 1148 state-changes each to fit on this tree, and the number of characters is only 57.

Keywords: Phylogenetic tree, maximum parsimony, homoplasy, tree search

Phylogenetic tree reconstruction methods based on optimization criteria (such as maximum parsimony or maximum likelihood) have long been known to be computationally intractable (NP-hard) (Foulds and Graham, 1982). However, on perfectly tree-like data (i.e. long sequences with low homoplasy), these methods will generally find the optimal tree quickly, even for large datasets. Moreover, when data is largely tree-like, there are good theoretical and computational methods for finding an optimal tree under methods such as maximum parsimony, with an early result more than 30 years ago (Hendy et al., 1980), along with more recent developments (Blelloch et al., 2006; Holland et al., 2005).

So far, it has not been clear whether such methods would be able to find the global 'optimal' tree for homoplasy-rich datasets with large numbers of taxa, particularly when the sequences are short. The traditional view (Sokal and Sneath, 1963) is that homoplasy tends to obscure tree signal, requiring more character data than homoplasy-free data to recover a tree, though contrary opinions that homoplasy can 'help' have also appeared (Källersjö et al., 1999).

A fundamental obstacle arises in trying to answer this question: One usually cannot guarantee in advance that any tree will be optimal for homoplasy-rich data without first searching exhaustively through tree space, and this precludes datasets involving hundreds (let alone thousands) of taxa. However, a recent mathematical result by Chai and Hous-
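The maximum parsimony criterion discussed in this excerpt can be illustrated with a minimal sketch. It assumes Fitch's classical small-parsimony recursion as the scoring method (the excerpt does not say which scoring routine is used), and the names `fitch_site` and `parsimony_score`, the nested-tuple tree encoding, and the toy alignment are all hypothetical. Scoring one fixed tree is cheap; the intractability cited from Foulds and Graham (1982) concerns searching over all trees.

```python
from typing import Dict, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Return (candidate state set, minimum changes) for one character."""
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change at this node.
        return common, left_cost + right_cost
    # The children disagree: one change is needed on an edge below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum the per-site Fitch costs over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


# Toy data, invented for illustration only.
alignment = {"A": "AAG", "B": "AAG", "C": "CAG", "D": "CTG"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_score(tree_1, alignment), parsimony_score(tree_2, alignment))
```

On this toy alignment the first tree needs fewer state changes than the second, so it is the more parsimonious of the two. A full search would have to compare every possible topology in this way, which is what becomes infeasible as the number of taxa grows.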


“…Minimizing the length of a phylogeny is the problem of finding the most parsimonious tree, a well known NP-complete problem [7]. Researchers have thus focused on either sophisticated heuristics or solving optimally for special cases (e.g.…”

Section: Introduction

mentioning

confidence: 99%

Simple Reconstruction of Binary Near-Perfect Phylogenetic Trees

Sridhar, Dhamdhere, Blelloch, et al.

2006

Computational Science – ICCS 2006


Abstract. We consider the problem of reconstructing near-perfect phylogenetic trees using binary character states (referred to as BNPP). A perfect phylogeny assumes that every character mutates at most once in the evolutionary tree, yielding an algorithm for binary character states that is computationally efficient but not robust to imperfections in real data. A near-perfect phylogeny relaxes the perfect phylogeny assumption by allowing at most a constant number $q$ of additional mutations. In this paper, we develop an algorithm for constructing optimal phylogenies and provide empirical evidence of its performance. The algorithm runs in time $O((72\kappa)^q nm + nm^2)$, where $n$ is the number of taxa, $m$ is the number of characters and $\kappa$ is the number of characters that share four gametes with some other character. This is fixed parameter tractable when $q$ and $\kappa$ are constants and significantly improves on the previous asymptotic bounds by reducing the exponent to $q$. Furthermore, the complexity of the previous work makes it impractical and in fact no known implementation of it exists. We implement our algorithm and demonstrate it on a selection of real data sets, showing that it substantially outperforms its worst-case bounds and yields far superior results to a commonly used heuristic method in at least one case. Our results therefore describe the first practical phylogenetic tree reconstruction algorithm that finds guaranteed optimal solutions while being easily implemented and computationally feasible for data sets of biologically meaningful size and complexity.
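The quantity κ in the abstract above counts characters that "share four gametes" with some other character. The following is a minimal sketch of that count, assuming the standard four-gamete test for binary characters; it is not the authors' reconstruction algorithm. The function names and the toy matrix are hypothetical, and rows are assumed to be taxa written as equal-length strings of `0` and `1`.

```python
from itertools import combinations
from typing import List, Sequence, Set


def has_four_gametes(col_a: Sequence[str], col_b: Sequence[str]) -> bool:
    """True if the two binary characters show all of 00, 01, 10 and 11."""
    return len(set(zip(col_a, col_b))) == 4


def conflicting_characters(matrix: List[str]) -> Set[int]:
    """Indices of characters that share four gametes with another character."""
    if not matrix:
        return set()
    # Transpose the taxa-by-characters matrix into one tuple per character.
    columns = list(zip(*matrix))
    conflicts: Set[int] = set()
    for i, j in combinations(range(len(columns)), 2):
        if has_four_gametes(columns[i], columns[j]):
            conflicts.update((i, j))
    return conflicts


# Toy matrix, invented for illustration: 4 taxa, 3 binary characters.
matrix = ["000", "010", "100", "111"]
kappa = len(conflicting_characters(matrix))
print(kappa)
```

In the toy matrix the first two characters conflict with each other and the third is compatible with both. For binary characters, an empty conflict set is the classical condition under which a perfect phylogeny exists, which is why κ is a natural measure of how far the data are from perfect.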
