Comparative methods used to study patterns of evolutionary change in a continuous trait on a phylogeny range from Brownian motion processes to models where the trait is assumed to evolve according to an Ornstein-Uhlenbeck (OU) process. Although these models have proved useful in a variety of contexts, they still do not cover all the scenarios biologists want to examine. For models based on the OU process, model complexity is restricted in current implementations by assuming that the rate of stochastic motion and the strength of selection do not vary among selective regimes. Here, we expand the OU model of adaptive evolution to include models that variously relax the assumption of a constant rate and strength of selection. In its most general form, the methods described here can assign each selective regime a separate trait optimum, a rate of stochastic motion parameter, and a parameter for the strength of selection. We use simulations to show that our models can detect meaningful differences in the evolutionary process, especially with larger sample sizes. We also illustrate our method using an empirical example of genome size evolution within a large flowering plant clade.

KEYWORDS: Brownian motion, comparative method, continuous characters, Hansen model, Ornstein-Uhlenbeck.

Single-rate Brownian motion works reasonably well as a model for the evolution of traits. It models drift, drift-mutation balance, and even stabilizing selection toward a moving optimum (Hansen and Martins 1996). However, a single-parameter model certainly cannot explain the evolution of traits across all life. There have been extensions to the model, such as a single Ornstein-Uhlenbeck (OU) process that has a constant pull toward an optimum value, a multiple-mean OU process with different possible means for different groups (Hansen 1997; Butler and King 2004), and multiple-rate Brownian motion processes allowing different rates of evolution on different branches (O'Meara et al. 2006; Thomas et al. 2006). These models, while useful, still do not cover all the scenarios biologists want to examine. For example, existing models with a value toward which species are being pulled all have a fixed strength of pull over the entire history of the group. It is possible to allow the rate of stochastic motion to vary, or the value of the attractor to vary, but not for both to vary. Such restrictions on model complexity may make sense when phylogenies are limited to a few dozen taxa. However, in an era where phylogenies can have over 55,000 taxa (Smith et al. 2011), we may be so bold as to attempt to fit models that vary both rates and means of the evolutionary process. This article develops and implements such models.

Generalizing the Hansen Model

Hansen (1997) described a model where quantitative characters are assumed to evolve according to an OU process. The Hansen model, as it has become known, expresses the amount of change in a quantitative trait along each branch in a phylogeny and is given by the stochastic differential equation

    dX(t) = α[θ(t) − X(t)] dt + σ dB(t),   (1)

where X(t) is the trait value at time t, θ(t) is the optimum toward which the trait is drawn, α measures the strength of selection, σ is the rate of stochastic motion, and B(t) is standard Brownian motion.
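To make Equation (1) concrete, the following is a minimal R sketch (an illustration, not the authors' implementation) that simulates the OU process by Euler-Maruyama discretization, letting the optimum theta, the selection strength alpha, and the stochastic rate sigma differ between two hypothetical selective regimes:

    simulate_ou <- function(n_steps = 1000, dt = 0.01, x0 = 0,
                            regime = rep(1:2, each = n_steps / 2),
                            theta = c(0, 2), alpha = c(1, 3),
                            sigma = c(0.5, 0.5)) {
      # Euler-Maruyama discretization of dX = alpha*(theta - X)*dt + sigma*dB
      x <- numeric(n_steps + 1)
      x[1] <- x0
      for (i in seq_len(n_steps)) {
        r <- regime[i]  # which selective regime applies at this step
        x[i + 1] <- x[i] + alpha[r] * (theta[r] - x[i]) * dt +
          sigma[r] * sqrt(dt) * rnorm(1)
      }
      x
    }

    set.seed(1)
    trait <- simulate_ou()
    plot(trait, type = "l", xlab = "time step", ylab = "trait value")

Shifting theta mid-series pulls the trait toward the new optimum, and a larger alpha shortens the time needed to reach it; the article's contribution is estimating such regime-specific parameters jointly on a phylogeny rather than along a single lineage.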
As computational work becomes more and more integral to many aspects of scientific research, computational reproducibility has become an issue of increasing importance to computer systems researchers and domain scientists alike. Though computational reproducibility seems more straightforward than replicating physical experiments, the complex and rapidly changing nature of computer environments makes being able to reproduce and extend such work a serious challenge. In this paper, I explore common reasons that code developed for one research project cannot be successfully executed or extended by subsequent researchers. I review current approaches to these issues, including virtual machines and workflow systems, and their limitations. I then examine how Docker, a popular emerging technology, combines several ideas from systems research (operating-system virtualization, cross-platform portability, modular reusable elements, versioning, and a 'DevOps' philosophy) to address these challenges. I illustrate this with several examples of Docker use, with a focus on the R statistical environment.
Phylogenetic comparative methods may fail to produce meaningful results when either the underlying model is inappropriate or the data contain insufficient information to inform the inference. The ability to measure the statistical power of these methods has become crucial to ensure that data quantity keeps pace with growing model complexity. Through simulations, we show that commonly applied model choice methods based on information criteria can have remarkably high error rates; this can be a problem because methods to estimate the uncertainty or power are not widely known or applied. Furthermore, the power of comparative methods can depend significantly on the structure of the data. We describe a Monte Carlo-based method that addresses both of these challenges, and show how this approach both quantifies and substantially reduces errors relative to information criteria. The method also produces meaningful confidence intervals for model parameters. We illustrate how the power to distinguish different models, such as varying levels of selection, varies with both the number of taxa and the structure of the phylogeny. We provide an open-source implementation in the pmc ("Phylogenetic Monte Carlo") package for the R programming language. We hope such power analysis becomes a routine part of model comparison in comparative methods.
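As a rough illustration of the parametric-bootstrap ("phylogenetic Monte Carlo") idea, the sketch below compares Brownian motion against an OU model using geiger's fitContinuous() and sim.char(). This is a simplified stand-in, not the pmc package's internals; consult the package documentation for the actual pmc() interface.

    library(ape)     # phylogeny structures
    library(geiger)  # fitContinuous() and sim.char()

    # Likelihood-ratio statistic for OU vs BM, plus the BM rate estimate
    lr_stat <- function(phy, dat) {
      bm <- fitContinuous(phy, dat, model = "BM")
      ou <- fitContinuous(phy, dat, model = "OU")
      list(stat = 2 * (ou$opt$lnL - bm$opt$lnL), sigsq = bm$opt$sigsq)
    }

    # Monte Carlo p-value: simulate under the simpler model (BM),
    # refit both models, and compare observed vs null likelihood ratios
    pmc_sketch <- function(phy, dat, nboot = 100) {
      obs  <- lr_stat(phy, dat)
      sims <- sim.char(phy, par = matrix(obs$sigsq), nsim = nboot, model = "BM")
      null <- sapply(seq_len(nboot), function(i) lr_stat(phy, sims[, 1, i])$stat)
      mean(null >= obs$stat)
    }

The full method described above goes further, simulating under both candidate models to obtain the pair of likelihood-ratio distributions and the parameter confidence intervals the abstract mentions.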
This article introduces rfishbase, a package that provides interactive and programmatic access to the FishBase repository. The package allows interaction with data on over 30,000 fish species from within R, a rich statistical computing environment. This direct, scriptable interface to FishBase data enables better discovery and integration, which is essential for large-scale comparative analyses. This article provides several examples to illustrate how the package works, and how it can be integrated into phylogenetics packages such as ape and geiger.
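A brief usage sketch follows, written against the current rfishbase API, which may differ from the package version described in this article; the column names shown are assumptions based on the FishBase species table:

    library(rfishbase)

    fish   <- c("Oreochromis niloticus", "Salmo trutta")
    traits <- species(fish)  # species-level records from FishBase
    traits[, c("Species", "Length", "Weight")]  # assumed column names

Because the result is an ordinary data frame keyed by species name, it can be matched to the tip labels of trees handled by ape or geiger for the comparative analyses the article describes.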
Catastrophic regime shifts in complex natural systems may be averted through advanced detection. Recent work has provided a proof-of-principle that many systems approaching a catastrophic transition may be identified through the lens of early warning indicators such as rising variance or increased return times. Despite widespread appreciation of the difficulties and uncertainty involved in such forecasts, proposed methods hardly ever characterize their expected error rates. Without the benefits of replicates, controls or hindsight, applications of these approaches must quantify how reliable different indicators are in avoiding false alarms, and how sensitive they are to missing subtle warning signs. We propose a model-based approach to quantify this trade-off between reliability and sensitivity and allow comparisons between different indicators. We show these error rates can be quite severe for common indicators even under favourable assumptions, and also illustrate how a model-based indicator can improve this performance. We demonstrate how the performance of an early warning indicator varies in different datasets, and suggest that uncertainty quantification become a more central part of early warning predictions.
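For intuition, the following minimal R sketch (an illustration, not the authors' model-based indicator) computes a rolling-window variance, one of the generic early warning indicators discussed above, on a simulated time series whose stability weakens over time:

    set.seed(42)
    n <- 500
    x <- numeric(n)
    for (t in 2:n) {
      stability <- 1 - 0.9 * t / n  # return force weakens as the transition nears
      x[t] <- (1 - 0.1 * stability) * x[t - 1] + rnorm(1, sd = 0.1)
    }

    window   <- 50
    roll_var <- sapply(window:n, function(i) var(x[(i - window + 1):i]))
    plot(window:n, roll_var, type = "l",
         xlab = "time", ylab = "rolling variance")

A sustained upward trend in the rolling variance is read as a warning; the paper's point is that declaring such a trend requires quantifying the trade-off between false alarms and missed events, which a model-based indicator makes explicit.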