Species delimitation is the act of identifying species-level biological diversity. In recent years, the field has witnessed a dramatic increase in the number of methods available for delimiting species. However, most recent investigations only utilize a handful (i.e. 2-3) of the available methods, often for unstated reasons. Because the parameter space that is potentially relevant to species delimitation far exceeds the parameterization of any existing method, a given method necessarily makes a number of simplifying assumptions, any one of which could be violated in a particular system. We suggest that researchers should apply a wide range of species delimitation analyses to their data and place their trust in delimitations that are congruent across methods. Incongruence across the results from different methods is evidence of either a difference in the power to detect cryptic lineages across one or more of the approaches used to delimit species or a violation of the assumptions of one or more of the methods. In either case, the inferences drawn from species delimitation studies should be conservative, for in most contexts it is better to fail to delimit species than it is to falsely delimit entities that do not represent actual evolutionary lineages.
Species are a fundamental unit for biological studies, yet no uniform guidelines exist for determining species limits in an objective manner. Given the large number of species concepts available, defining species can be both highly subjective and biased. Although morphology has been commonly used to determine species boundaries, the availability and prevalence of genetic data have allowed researchers to use such data to make inferences regarding species limits. Genetic data have also been used in the detection of cryptic species, where other lines of evidence (morphology in particular) may underestimate species diversity. In this study, we investigate species limits in a complex of morphologically conserved trapdoor spiders (Mygalomorphae, Antrodiaetidae, Aliatypus) from California. Multiple approaches were used to determine species boundaries in this highly genetically fragmented group, including both multilocus discovery and validation approaches (plus a chimeric approach). Additionally, we introduce a novel tree-based discovery approach using species trees. Results suggest that this complex includes multiple cryptic species, with two groupings consistently recovered across analyses. Due to incongruence across analyses for the remaining samples, we take a conservative approach and recognize a three-species complex, and formally describe two new species (Aliatypus roxxiae, sp. nov. and Aliatypus starretti, sp. nov.). This study helps to clarify species limits in a genetically fragmented group and provides a framework for identifying and defining the cryptic lineage diversity that prevails in many organismal groups.
Empirical phylogeographic studies have progressively sampled greater numbers of loci over time, in part motivated by theoretical papers showing that estimates of key demographic parameters improve as the number of loci increases. Recently, next-generation sequencing has been applied to questions about organismal history, with the promise of revolutionizing the field. However, no systematic assessment of how phylogeographic data sets have changed over time with respect to overall size and information content has been performed. Here, we quantify the changing nature of these genetic data sets over the past 20 years, focusing on papers published in Molecular Ecology. We found that the number of independent loci, the total number of alleles sampled and the total number of single nucleotide polymorphisms (SNPs) per data set have all increased over time, with particularly dramatic increases within the past 5 years. Interestingly, uniparentally inherited organellar markers (e.g. animal mitochondrial and plant chloroplast DNA) continue to represent an important component of phylogeographic data. Single-species studies (cf. comparative studies) that focus on vertebrates (particularly fish and to some extent, birds) represent the gold standard of phylogeographic data collection. Based on the current trajectory seen in our survey data, forecast modelling indicates that the median number of SNPs per data set for studies published by the end of the year 2016 may approach ~20 000.
This survey provides baseline information for understanding the evolution of phylogeographic data sets and underscores the fact that development of analytical methods for handling very large genetic data sets will be critical for facilitating growth of the field.
Keywords: DNA sequences, information content, phylogeography, sampling, single nucleotide polymorphisms, temporal trends
Introduction
Phylogeographers have been working to collect multilocus data ever since a series of theoretical papers pertinent to the discipline demonstrated that estimates of key demographic parameters improve as the number of loci increases (e.g. Edwards & Beerli 2000; Hey & Nielsen 2004; Felsenstein 2006; Carling & Brumfield 2007). Recent improvements in DNA sequencing technology have led to platforms with greater speed, resolution and/or output (e.g. Margulies et al. 2005; Bentley et al. 2008; Rothberg et al. 2011) when compared to the traditional Sanger method. These technological advances, together with the development of general-purpose protocols for discovering and screening many DNA sequence polymorphisms arrayed across a species' genome (e.g. Baird et al. 2008; Kerstens et al. 2009; Faircloth et al. 2012; Peterson et al. 2012), are transforming the field of phylogeography to one that is no longer data limited. Investigations concerned with reconstructing long-term population history generally require large numbers of sampled alleles (i.e. many individuals and populations), across multiple loci, to adequately characterize levels of diversity and spatial genetic structuring (McCor...
Model checking is a critical part of Bayesian data analysis, yet it remains largely unused in systematic studies. Although phylogeny estimation has recently moved into an era of increasingly complex models that simultaneously account for multiple evolutionary processes, the statistical fit of these models to the data has rarely been tested. Here we develop a posterior predictive simulation-based model check for a commonly used multispecies coalescent model, implemented in *BEAST, and apply it to 25 published data sets. We show that poor model fit is detectable in the majority of data sets; that this poor fit can mislead phylogenetic estimation; and that in some cases it stems from processes of inherent interest to systematists. We suggest that as systematists scale up to phylogenomic data sets, which will be subject to a heterogeneous array of evolutionary processes, critically evaluating the fit of models to data is an analytical step that can no longer be ignored.