2018
DOI: 10.1007/978-3-030-00834-5_13

Multi-SpaM: A Maximum-Likelihood Approach to Phylogeny Reconstruction Using Multiple Spaced-Word Matches and Quartet Trees

Abstract: Motivation: Word-based or 'alignment-free' methods for phylogeny reconstruction are much faster than traditional approaches, but they are generally less accurate. Most of these methods calculate pairwise distances for a set of input sequences, for example from word frequencies, from so-called spaced-word matches or from the average length of common substrings. Results: In this paper, we propose the first word-based approach to tree reconstruction that is based on multiple sequence comparison and Maximum Likeli…

Cited by 14 publications (16 citation statements)
References 56 publications
“…The best-performing tools for the plant data set are co-phylog [21], mash [9] and Multi-SpaM [23], all of which almost perfectly recovered the reference tree topology of the plant species (with an nRF = 0.09 for all three programs). In each of the trees produced by these programs, there was exactly one species placed at an incorrect position compared to its position in the reference tree, namely, in the speciation order in the Brassicaceae family for co-phylog (Additional file 2: Figure S4), mash (Additional file 2: Figure S5) and for Multi-SpaM, the last of which placed Carica papaya outside the Brassicales order (Additional file 2: Figure S6).…”
Section: Assembled Genomes (mentioning)
confidence: 86%
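
The nRF value quoted in the statement above is a normalized Robinson-Foulds distance between a reconstructed tree and the reference tree. As a rough illustration only, the sketch below computes such a distance for two small unrooted trees written as nested tuples. The tree encoding, the function names, and the choice to normalize by the total number of non-trivial splits are assumptions made for this example; the excerpt does not state which normalization the citing paper used, and the toy trees are not the benchmark data.

```python
def leaf_set(tree):
    """Return the set of leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (splits) of an unrooted tree.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)  # arbitrary but fixed reference leaf
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(walk(child) for child in node))
        other = all_leaves - below
        # Skip trivial splits (a single leaf, or nothing, on one side).
        if len(below) > 1 and len(other) > 1:
            found.add(other if ref in below else below)
        return below

    walk(tree)
    return found


def normalized_rf(tree_a, tree_b):
    """Robinson-Foulds distance divided by the total number of splits.

    Assumption: both trees have the same leaf set with at least four leaves.
    For fully resolved trees with n leaves the denominator equals 2 * (n - 3).
    """
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf set")
    splits_a, splits_b = splits(tree_a), splits(tree_b)
    total = len(splits_a) + len(splits_b)
    if total == 0:
        raise ValueError("trees have no non-trivial splits")
    return len(splits_a ^ splits_b) / total


# Hypothetical five-leaf trees that differ in the placement of one leaf.
reference = ((("A", "B"), "C"), ("D", "E"))
estimate = ((("A", "C"), "B"), ("D", "E"))
print(normalized_rf(reference, estimate))
```

A value of 0 means the two topologies are identical and a value of 1 means they share no non-trivial split, so small values such as the one quoted above indicate trees that agree on almost all groupings.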
“…The sequences of 25 whole mitochondrial genomes of fish species from the suborder Labroidei and the species tree were taken from Fischer et al [48]. The set of 29 E. coli genome sequences was originally compiled by Yin and Jin [21] and has been used in the past by other groups to evaluate AF programs [22,23,68]. Finally, the set of 14 plant genomes is from Hatje et al [69].…”
Section: Data Sets (mentioning)
confidence: 99%
“…Most of these approaches calculate heuristic measures of sequence (dis-)similarity that are difficult to interpret. At the same time, alignment-free methods have been proposed that can accurately estimate phylogenetic distances between sequences based on stochastic models of DNA or protein evolution, using the length of common substrings [22,35] or so-called micro alignments [54,21,30,29,12].…”
Section: Introduction (mentioning)
confidence: 99%