Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.
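The stepping stone method mentioned above can be illustrated with a minimal sketch. It assumes the usual form of the estimator: the marginal likelihood is written as a product of ratios between adjacent power posteriors, and each ratio is estimated as the sample mean of the likelihood raised to the step in the power. The function names and the input layout are hypothetical, and this is not the MrBayes implementation.

```python
import math


def log_mean_exp(values):
    """Return log(mean(exp(v))) computed stably by factoring out the maximum."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def stepping_stone_log_ml(betas, loglik_samples):
    """Estimate the log marginal likelihood from power-posterior samples.

    betas: increasing powers from 0.0 (prior) to 1.0 (posterior), length K + 1.
    loglik_samples[k]: log-likelihoods of draws from the power posterior
        with power betas[k], for k = 0 .. K - 1 (hypothetical input layout).
    """
    if len(loglik_samples) != len(betas) - 1:
        raise ValueError("need one sample list per step between adjacent betas")
    total = 0.0
    for k in range(len(betas) - 1):
        step = betas[k + 1] - betas[k]
        # Ratio k is the mean of L**step under the power posterior at betas[k].
        total += log_mean_exp([step * ll for ll in loglik_samples[k]])
    return total
```

In practice the samples at each power would come from separate MCMC stages; the sketch only covers the final combination step.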
Aim Oceanic islands represent a special challenge to historical biogeographers because dispersal is typically the dominant process while most existing methods are based on vicariance. Here, we describe a new Bayesian approach to island biogeography that estimates island carrying capacities and dispersal rates based on simple Markov models of biogeographical processes. This is done in the context of simultaneous analysis of phylogenetic and distributional data across groups, accommodating phylogenetic uncertainty and making parameter estimates more robust. We test our models on an empirical data set of published phylogenies of Canary Island organisms to examine overall dispersal rates and correlation of rates with explanatory factors such as geographic proximity and area size.
Location Oceanic archipelagos with special reference to the Atlantic Canary Islands.
Methods The Canary Islands were divided into three island-groups, corresponding to the main magmatism periods in the formation of the archipelago, while non-Canarian distributions were grouped into a fourth 'mainland-island'. Dispersal between island groups, which were assumed constant through time, was modelled as a homogeneous, time-reversible Markov process, analogous to the standard models of DNA evolution. The stationary state frequencies in these models reflect the relative carrying capacity of the islands, while the exchangeability (rate) parameters reflect the relative dispersal rates between islands. We examined models of increasing complexity: Jukes-Cantor (JC), Equal-in, and General Time Reversible (GTR), with or without the assumption of stepping-stone dispersal. The data consisted of 13 Canarian phylogenies: 954 individuals representing 393 taxonomic (morphological) entities. Each group was allowed to evolve under its own DNA model, with the island-model shared across groups. Posterior distributions on island model parameters were estimated using Markov Chain Monte Carlo (MCMC) sampling, as implemented in MrBayes 4.0, and Bayes Factors were used to compare models.
Results The Equal-in step, the GTR, and the GTR step dispersal models showed the best fit to the data. In the Equal-in and GTR models, the largest carrying capacity was estimated for the mainland, followed by the central islands and the western islands, with the eastern islands having the smallest carrying capacity. The relative dispersal rate was highest between the central and eastern islands, and between the central and western islands. The exchange with the mainland was rare in comparison.
INTRODUCTION
After being relegated to a simple 'footnote acknowledgment' (Lynch, 1989) in biogeographical papers for a long time, dispersal is again receiving increased attention as a fundamental process explaining the distribution of organisms (de Queiroz, 2005; McGlone, 2005; Riddle, 2005; Cowie & Holland, 2006). This trend can be observed in the large number of phylogeny-oriented articles published since 2004 in this journal and in Systematic Biology with the word 'dispersal' in their abst...
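The island model described in the Methods can be made concrete with a small sketch. It assumes the standard GTR construction, in which the rate of moving from island group i to island group j is the exchangeability r_ij times the stationary frequency of j. The function name and all numbers in the usage note are hypothetical illustrations, not estimates from the study.

```python
def gtr_rate_matrix(pi, exch):
    """Build a time-reversible rate matrix Q with q_ij = r_ij * pi_j.

    pi: relative carrying capacities (stationary frequencies), one per island group.
    exch: symmetric matrix of relative dispersal rates (exchangeabilities).
    Returns the normalized frequencies and Q as a list of rows.
    """
    n = len(pi)
    total = sum(pi)
    pi = [p / total for p in pi]  # normalize so the frequencies sum to 1
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = exch[i][j] * pi[j]
        # The diagonal makes each row sum to zero (it is still 0.0 when summed).
        q[i][i] = -sum(q[i])
    return pi, q
```

For example, with four groups ordered mainland, eastern, central, western, calling `gtr_rate_matrix([4, 1, 3, 2], r)` with a symmetric `r` gives a matrix that satisfies detailed balance, pi_i * q_ij = pi_j * q_ji. A stepping-stone variant could be expressed by setting the exchangeabilities between non-adjacent groups to zero.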
The main limiting factor in Bayesian MCMC analysis of phylogeny is typically the efficiency with which topology proposals sample tree space. Here we evaluate the performance of seven different proposal mechanisms, including most of those used in current Bayesian phylogenetics software. We sampled 12 empirical nucleotide data sets--ranging in size from 27 to 71 taxa and from 378 to 2,520 sites--under difficult conditions: short runs, no Metropolis-coupling, and an oversimplified substitution model producing difficult tree spaces (Jukes Cantor with equal site rates). Convergence was assessed by comparison to reference samples obtained from multiple Metropolis-coupled runs. We find that proposals producing topology changes as a side effect of branch length changes (LOCAL and Continuous Change) consistently perform worse than those involving stochastic branch rearrangements (nearest neighbor interchange, subtree pruning and regrafting, tree bisection and reconnection, or subtree swapping). Among the latter, moves that use an extension mechanism to mix local with more distant rearrangements show better overall performance than those involving only local or only random rearrangements. Moves with only local rearrangements tend to mix well but have long burn-in periods, whereas moves with random rearrangements often show the reverse pattern. Combinations of moves tend to perform better than single moves. The time to convergence can be shortened considerably by starting with a good tree, but this comes at the cost of compromising convergence diagnostics based on overdispersed starting points. Our results have important implications for developers of Bayesian MCMC implementations and for the large group of users of Bayesian phylogenetics software.
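Comparing a test run against a reference sample can be sketched as follows. This is an assumption about one reasonable way to make such a comparison, using the largest absolute difference in split (bipartition) frequencies; it is not necessarily the measure used in the study. Each sampled tree is represented, hypothetically, as a collection of splits, and each split as a frozenset of taxon names.

```python
from collections import Counter


def split_frequencies(trees):
    """Return the fraction of sampled trees that contain each split."""
    if not trees:
        raise ValueError("need at least one sampled tree")
    counts = Counter()
    for splits in trees:
        counts.update(set(splits))  # count each split at most once per tree
    return {split: c / len(trees) for split, c in counts.items()}


def max_split_freq_difference(sample, reference):
    """Largest absolute difference in split frequency between two samples."""
    fs = split_frequencies(sample)
    fr = split_frequencies(reference)
    every_split = set(fs) | set(fr)
    # A split that is absent from one sample has frequency 0.0 there.
    return max((abs(fs.get(s, 0.0) - fr.get(s, 0.0)) for s in every_split),
               default=0.0)
```

A value near zero suggests the test run has reached the same distribution over topologies as the reference; how small is small enough is a judgment call that depends on sample size.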
Mossel and Vigoda (Reports, 30 September 2005, p. 2207) show that nearest neighbor interchange transitions, commonly used in phylogenetic Markov chain Monte Carlo (MCMC) algorithms, perform poorly on mixtures of dissimilar trees. However, the conditions leading to their results are artificial. Standard MCMC convergence diagnostics would detect the problem in real data, and correction of the model misspecification would solve it.