Recent theoretical work has demonstrated that Neighbor Joining applied to concatenated DNA sequences is a statistically consistent method of species tree reconstruction. This brief note compares the accuracy of this approach with that of other popular statistically consistent species tree reconstruction algorithms, including ASTRAL-II, Neighbor Joining using average gene-tree internode distances (NJst), and SVD-Quartets+PAUP*, as well as concatenation using maximum likelihood (RAxML). We find that Neighbor Joining applied to concatenated sequences, which is also the faster approach, is among the most effective of these methods for accurate species tree reconstruction.
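As a rough illustration of the pipeline this abstract describes, the following minimal sketch concatenates per-gene alignments, computes pairwise distances, and runs a textbook Neighbor Joining on them. The function names are hypothetical, and the Jukes-Cantor correction is an assumption made here for illustration; the abstract does not state which distance correction the cited theoretical work relies on.

```python
import math
from itertools import combinations


def concatenate(alignments):
    """Join per-gene alignments (dicts: taxon -> aligned sequence) end to end."""
    taxa = sorted(alignments[0])
    return {t: "".join(gene[t] for gene in alignments) for t in taxa}


def p_distance(s, t):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(s, t)) / len(s)


def jc_distance(s, t):
    """Jukes-Cantor corrected distance (assumed correction; needs p < 0.75)."""
    return -0.75 * math.log(1 - 4 * p_distance(s, t) / 3)


def neighbor_joining(dist):
    """Textbook NJ. dist maps frozenset({x, y}) -> distance.

    Returns an unrooted topology as nested tuples (branch lengths omitted).
    """
    d = dict(dist)
    nodes = sorted({x for pair in dist for x in pair})
    while len(nodes) > 3:
        n = len(nodes)
        # Row sums of the current distance matrix.
        r = {x: sum(d[frozenset((x, y))] for y in nodes if y != x) for x in nodes}
        # Join the pair minimising the NJ criterion Q(i, j).
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        u = (i, j)
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - d[frozenset((i, j))]
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    return tuple(nodes)


def concatenation_nj(alignments):
    """Concatenate genes, compute corrected distances, and build an NJ tree."""
    seqs = concatenate(alignments)
    dist = {
        frozenset(p): jc_distance(seqs[p[0]], seqs[p[1]])
        for p in combinations(sorted(seqs), 2)
    }
    return neighbor_joining(dist)
```

Calling `concatenation_nj` on a list of gene alignments that share the same taxa returns the inferred topology as nested tuples. A production analysis would use an established phylogenetics package rather than this sketch.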
Background: First proposed by Cavender and Felsenstein, and Lake, invariant-based algorithms for phylogenetic reconstruction were widely dismissed by practicing biologists because invariants were perceived to have limited accuracy in constructing trees based on DNA sequences of reasonable length. Recent developments by algebraic geometers have led to the construction of lists of invariants which have been demonstrated to be more accurate on short sequences, but were limited in that they could only be used for trees with small numbers of taxa. We have developed and tested an invariant-based quartet puzzling algorithm which is accurate and efficient for biologically reasonable data sets. Results: We found that our algorithm outperforms maximum-likelihood-based quartet puzzling on data sets simulated with low to medium evolutionary rates. For faster rates of evolution, invariant-based quartet puzzling is reasonable but less effective than maximum-likelihood-based puzzling. Conclusions: This is a proof-of-concept algorithm which is not intended to replace existing reconstruction algorithms. Rather, the conclusion is that when seeking solutions to a new wave of phylogenetic problems (supertree algorithms, gene vs. species tree, mixture models), invariant-based methods should be considered. This article demonstrates that invariants are a practical, reasonable and flexible source for reconstruction techniques.
We apply classical quartet techniques to the problem of phylogenetic decisiveness and find a value k such that all collections of at least k quartets are decisive. Moreover, we prove that this bound is optimal and give a lower bound on the probability that a collection of quartets is decisive.
Given a group-based Markov model on a tree, one can compute the vertex representation of a polytope describing a toric variety associated to the algebraic statistical model. In the case of $\mathbb{Z}_2$ or $\mathbb{Z}_2 \times \mathbb{Z}_2$, these polytopes have applications in the field of phylogenetics. We provide a half-space representation for the $m$-claw tree where $G = \mathbb{Z}_2 \times \mathbb{Z}_2$, which corresponds to the Kimura-3 model of evolution.
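To make the correspondence between $\mathbb{Z}_2 \times \mathbb{Z}_2$ and the Kimura-3 model concrete, the block below writes out a transition matrix in which each entry depends only on the group difference of the two nucleotides. The labeling of nucleotides by group elements and the symbols $a, b, c, d$ and $f$ are assumptions chosen here for illustration, not notation taken from the paper.

```latex
% Assumed labeling: A = (0,0), G = (1,0), C = (0,1), T = (1,1) in Z_2 x Z_2.
% Rows and columns are ordered A, G, C, T.
\[
P \;=\;
\begin{pmatrix}
a & b & c & d \\
b & a & d & c \\
c & d & a & b \\
d & c & b & a
\end{pmatrix},
\qquad a + b + c + d = 1 .
\]
% Group-based property: P_{gh} = f(g - h), with
%   f(0,0) = a  (no change),
%   f(1,0) = b  (transitions A<->G, C<->T),
%   f(0,1) = c  (transversions A<->C, G<->T),
%   f(1,1) = d  (transversions A<->T, G<->C).
```

Because the three nonzero elements of $\mathbb{Z}_2 \times \mathbb{Z}_2$ index the three substitution classes, the model has exactly three free parameters once the rows are required to sum to one.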
Quartet trees displayed by larger phylogenetic trees have long been used as inputs for species tree and supertree reconstruction. Computational constraints prevent the use of all displayed quartets in many practical problems with large numbers of taxa. We introduce the notion of an Efficient Quartet System (EQS) to represent a phylogenetic tree with a subset of the quartets displayed by the tree. We show mathematically that the set of quartets obtained from a tree via an EQS contains all of the combinatorial information of the tree itself. Using performance tests on simulated datasets, we also demonstrate that using an EQS to reduce the number of quartets, both in summary method pipelines for species tree inference and in methods for supertree inference, results in only small reductions in accuracy.
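For readers unfamiliar with displayed quartets, the following minimal sketch enumerates every quartet displayed by a small tree given as nested tuples. It does not implement an EQS, whose construction is not described in this abstract; it only shows the full quartet set that such a system would subsample. The function names and the tree encoding are hypothetical.

```python
from itertools import combinations


def leaf_sets(tree):
    """Return (leaves of tree, list of leaf sets under each internal node)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    clades = []
    for child in tree:
        child_leaves, child_clades = leaf_sets(child)
        leaves |= child_leaves
        clades.extend(child_clades)
    clades.append(leaves)
    return leaves, clades


def displayed_quartets(tree):
    """Return the set of quartets ((a, b), (c, d)) displayed by the tree.

    A quartet ab|cd is displayed when some edge of the tree separates
    {a, b} from {c, d}, i.e. some clade contains exactly one of the two pairs.
    """
    taxa, clades = leaf_sets(tree)
    quartets = set()
    for four in combinations(sorted(taxa), 4):
        a, b, c, d = four
        # The three possible resolutions of a four-taxon set.
        for pair, rest in (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))):
            for clade in clades:
                inside = clade & set(four)
                if inside == set(pair) or inside == set(rest):
                    quartets.add((pair, rest))
                    break
    return quartets
```

For a fully resolved tree on n taxa this yields one quartet per four-taxon subset, so the count grows on the order of n^4, which is the computational burden the abstract refers to.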