2017
DOI: 10.1093/sysbio/syw104

Homology-aware Phylogenomics at Gigabase Scales

Abstract: Obstacles to inferring species trees from whole genome data sets range from algorithmic and data management challenges to the wholesale discordance in evolutionary history found in different parts of a genome. Recent work that builds trees directly from genomes by parsing them into sets of small …

Citations: cited by 11 publications (13 citation statements)
References: 66 publications (114 reference statements)
“…As a 'control', we also considered the tree (O. punctata, (O. meridionalis, (O. barthii, O. rufipogon))), which replaces the problematic O. glumaepatula with another taxon in the AA clade, O. meridionalis, which is more distantly related to the Asian/African pair. Small alignment blocks sampled across the sequences of each chromosome were assembled using the program hakmer [115], which identifies exactly matching syntenic single-copy k-mers (k = 32) found in all four taxa, extended by an ungapped 25-nt region up- and downstream of each k-mer. Typically, the final datasets included 10-15% of the original sequence data after this culling.…”
Section: Methods (mentioning)
confidence: 99%
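
The block-building step described in the statement above can be sketched roughly as follows. This is an illustrative sketch only, not hakmer's actual implementation: the function names `single_copy_kmers` and `shared_blocks` are hypothetical, the synteny check is omitted, and reverse complements are ignored. It assumes each taxon is given as a single plain-text sequence.

```python
from collections import Counter


def single_copy_kmers(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {seq[i:i + k]: i
            for i in range(len(seq) - k + 1)
            if counts[seq[i:i + k]] == 1}


def shared_blocks(genomes, k=32, flank=25):
    """Return {kmer: {taxon: block}} for k-mers that are single-copy in every taxon.

    Each block is the k-mer plus `flank` ungapped bases on either side.
    Assumption: no synteny filtering and no reverse-complement handling.
    """
    positions = {name: single_copy_kmers(seq, k) for name, seq in genomes.items()}
    shared = set.intersection(*(set(p) for p in positions.values()))
    blocks = {}
    for kmer in sorted(shared):
        block = {}
        for name, seq in genomes.items():
            start = positions[name][kmer] - flank
            end = positions[name][kmer] + k + flank
            if start < 0 or end > len(seq):
                break  # a flank would run off the sequence; skip this k-mer
            block[name] = seq[start:end]
        else:
            blocks[kmer] = block  # reached only if no taxon was skipped
    return blocks


# Toy usage with short sequences and small k so the result is easy to follow.
toy = {"a": "TTACGTGG", "b": "CCACGTAA"}
print(shared_blocks(toy, k=4, flank=2))
```

Because every block has the same length (k plus two flanks) in every taxon, the blocks can be concatenated into an ungapped alignment without running a separate aligner, which is the property the statement relies on.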
“…We further consider the sequence capture success and the congruence of the phylogenetic results obtained via two widely‐adopted approaches: de novo (Jones & Good, 2016; McCormack et al, 2013) and reference‐guided (Johnson et al, 2016) assembly of target/exon sequence reads followed by sequence annotation and alignment. We also tested the application of a promising, yet underutilized, phylogenomic approach based on homologous substrings of sequence (k‐mer) (Sanderson et al, 2017). Gene and species trees were generated by maximum likelihood in IQ‐TREE 2 (Minh, Schmidt, et al, 2020) with branch support estimated by ultrafast bootstrap, and topological variation measured by gene concordance and site concordance factors (Minh, Hahn, et al, 2020).…”
Section: Introduction (mentioning)
confidence: 99%
“…Here again, attempts have been made to using standard ML approaches such as support vector machines [123] to guide the comparison of tree shapes, for instance, [124], which can then be used in epidemiology [121], but estimating a phylogenetic tree has proved more challenging. In one notable exception, an alignment-free distance-based tree-reconstruction method was proposed [125], but its main legacy seems to be in the development of k-mers, or unaligned sequences chopped into words of length k, to reconstruct phylogenetic trees, in particular in the context of phylogenomics (phylogenetics at a genomics scale) [126,127]. To the best of our knowledge, nobody has ever tried, yet, to train a neural network or even a deep learning algorithm [128][129][130] on a database of phylogenetic trees with corresponding alignments such as TreeBASE [131] or PANDIT [132].…”
Section: Cutting Corners With ABC and AI (mentioning)
confidence: 99%