Practical Speedup of Bayesian Inference of Species Phylogenies by Restricting the Space of Gene Trees

Wang, Yaxuan; Ogilvie, Huw A.; Nakhleh, Luay

doi:10.1101/770784

Cited by 2 publications

(4 citation statements)

References 56 publications

“…by using divide-and-conquer to break a large dataset into subsets or constraining the search space (e.g. [20,58,96,97]). However, Bayesian methods produce distributions from which point estimates can be obtained, and these distributions have significant additional value since they enable uncertainty quantification.…”

Section: Discussion (mentioning)

confidence: 99%

“…For example, there are new methods for large-scale ML tree estimation (e.g. Very Fast Tree [113]), new techniques to speed up co-estimation of gene trees and species trees [96,114], and even divide-and-conquer approaches to phylogenetic network estimation [115]. This continued effort to develop methods that are highly accurate and scalable leads us to the optimistic prediction that the next 5–10 years will result in new scalable methods to estimate accurate alignments, trees and even phylogenetic networks, and that these methods will enable biologists to make discoveries on the large and ultra-large phylogenomic datasets that they assemble.…”

Section: Discussion (mentioning)

confidence: 99%

Recent progress on methods for estimating and updating large phylogenies

Zaharias

Warnow

2022

Phil. Trans. R. Soc. B

With the increased availability of sequence data and even of fully sequenced and assembled genomes, phylogeny estimation of very large trees (even of hundreds of thousands of sequences) is now a goal for some biologists. Yet, the construction of these phylogenies is a complex pipeline presenting analytical and computational challenges, especially when the number of sequences is very large. In the past few years, new methods have been developed that aim to enable highly accurate phylogeny estimations on these large datasets, including divide-and-conquer techniques for multiple sequence alignment and/or tree estimation, methods that can estimate species trees from multi-locus datasets while addressing heterogeneity due to biological processes (e.g. incomplete lineage sorting and gene duplication and loss), and methods to add sequences into large gene trees or species trees. Here we present some of these recent advances and discuss opportunities for future improvements. This article is part of a discussion meeting issue ‘Genomic population structures of microbial pathogens’.

“…Bayesian methods, such as MrBayes (Ronquist and Huelsenbeck, 2003), are well established in the research community and have been shown to provide highly accurate point estimates of alignments, gene trees, and species trees; however, most Bayesian methods use MCMC (Markov Chain Monte Carlo) and are computationally intensive on large datasets since convergence to the stationary distribution is required for high confidence in an accurate result. Some progress has been made on improving the scalability of these point estimations using Bayesian methods, e.g., by using divide-and-conquer to break a large dataset into subsets or constraining the search space (e.g., Zimmermann et al (2014); Nute and Warnow (2016); Wang et al (2020); Gupta et al (2021)). However, Bayesian methods produce distributions from which point estimates can be obtained, and these distributions have significant additional value since they enable uncertainty quantification.…”

Section: Discussion (mentioning)

confidence: 99%

“…This study did not discuss all the recent advances in large-scale alignment and tree estimation, and some of these may provide even better scalability and accuracy. For example, there are new methods for large-scale maximum likelihood tree estimation (e.g., Very Fast Tree (Piñeiro et al, 2020)), new techniques to speed up co-estimation of gene trees and species trees (Wang and Nakhleh, 2018;Wang et al, 2020), and even divide-and-conquer approaches to phylogenetic network estimation (Zhu et al, 2019a). This continued effort to develop methods that are highly accurate and scalable leads us to the optimistic prediction that the next 5 to 10 years will result in new scalable methods to estimate accurate alignments, trees, and even phylogenetic networks, and that these methods will enable biologists to make discoveries on the large and ultra-large phylogenomic datasets that they assemble.…”

Section: Discussion (mentioning)

confidence: 99%

Recent Progress on Methods for Estimating and Updating Large Phylogenies

Zaharias

Warnow

2022

Preprint

With the increased availability of sequence data and even of fully sequenced and assembled genomes, phylogeny estimation of very large trees (even of hundreds of thousands of sequences) is now a goal for some biologists. Yet, the construction of these phylogenies is a complex pipeline presenting analytical and computational challenges, especially when the number of sequences is very large. In the last few years, new methods have been developed that aim to enable highly accurate phylogeny estimations on these large datasets, including divide-and-conquer techniques for multiple sequence alignment and/or tree estimation, methods that can estimate species trees from multi-locus datasets while addressing heterogeneity due to biological processes (e.g., incomplete lineage sorting and gene duplication and loss), and methods to add sequences into large gene trees or species trees. Here we present some of these recent advances and discuss opportunities for future improvements.
