Significance
Only an estimated 1 to 10% of Earth’s species have been formally described. This discrepancy between the number of species with a formal taxonomic description and actual number of species (i.e., the Linnean shortfall) hampers research across the biological sciences. To explore whether the Linnean shortfall results from poor taxonomic practice or not enough taxonomic effort, we applied machine-learning techniques to build a predictive model to identify named species that are likely to contain hidden diversity. Results indicate that small-bodied species with large, climatically variable ranges are most likely to contain hidden species. These attributes generally match those identified in the taxonomic literature, indicating that the Linnean shortfall is caused by societal underinvestment in taxonomy rather than poor taxonomic practice.
The open-science movement seeks to increase transparency, reproducibility, and access to scientific data. As primary data, preserved biological specimens represent records of global biodiversity critical to research, conservation, national security, and public health. However, a recent decrease in specimen preservation in public biorepositories is a major barrier to open biological science. As such, there is an urgent need for a cultural shift in the life sciences that normalizes specimen deposition in museum collections. Museums embody an open-science ethos and provide long-term research infrastructure through curation, data management and security, and community-wide access to samples and data, thereby ensuring scientific reproducibility and extension. We propose that a paradigm shift from specimen ownership to specimen stewardship can be achieved through increased open-data requirements among scientific journals and institutional requirements for specimen deposition by funding and permitting agencies, and through explicit integration of specimens into existing data management plan guidelines and annual reporting.
Patterns of genetic diversity within species contain information about the history of that species, including how it has responded to historical climate change and how easily it is able to disperse across its habitat. More than 40,000 phylogeographic and population genetic investigations have been published to date, each collecting genetic data from hundreds of samples. Despite these millions of data points, meta‐analyses are challenging because the synthesis of results across hundreds of studies, each using different methods and forms of analysis, is a daunting and time‐consuming task. It is more efficient to proceed by repurposing existing data and using automated data analysis. To facilitate data repurposing, we created a database (phylogatR) that aggregates data from different sources and conducts automated multiple sequence alignments and data curation to provide users with nearly ready‐to‐analyse sets of data for thousands of species. Two types of scientific research will be made easier by phylogatR: large meta‐analyses of thousands of species that can address classic questions in evolutionary biology and ecology, and student‐ or citizen‐science‐based investigations that will introduce a broad range of people to the analysis of genetic data. phylogatR enhances the value of existing data via the creation of software and web‐based tools that enable these data to be recycled and reanalysed, and that increase accessibility to big data for research laboratories and classroom instructors with limited computational expertise and resources.
Table 1. Species used in analysis. For each species, the scientific name, type of organism, type of data, number of sequences, and reference of original publication are shown.

| Species | Broad Taxon | Type of Data | # sequences | Original publication |
| Bryopsis sp. | Green Algae | cpDNA | 66 | Krellwitz et al. (2001) |
| Gracilaria tikvahiae | Red Algae | cpDNA | 20 | Gurgel et al. (2004) |
| Xerula furfuracea | Fungi | nuDNA | 41 | Yang et al. (2009) & Petersen and Hughes (2010) & Hao et al. (2016) |
| Sphagnum bartlettianum | Bryophyta | cpDNA + nuDNA | 12 | Shaw et al. (2005) |
| Acer rubrum | Angiosperm | cpDNA | 38 | McLachlan et al. (2005) |
| Apios americana | Angiosperm | nuDNA | 18 | Joly & Bruneau (2004) |
| Dicerandra spp | Angiosperm | cpDNA | 30 | Oliveira et al. (2007) |
| Fagus grandifolia | Angiosperm | cpDNA | 23 | McLachlan et al. (2005) |
| Liquidambar styraciflua | Angiosperm | cpDNA | 109 | Morris et al. (2008) |
| Prunus spp | Angiosperm | cpDNA | 226 | Shaw & Small (2005) |
| Tilia americana | Angiosperm | cpDNA | 297 | McCarthy and Mason-Gamer (2016) |
| Trillium cuneatum | Angiosperm | cpDNA | 281 | Gonzales et al. (2008) |
| Uniola paniculata | Angiosperm | cpDNA | 131 | Hodel & Gonzales (2013) |
| Bugula neritina | Bryozoa | mtDNA | 30 | McGovern & Hellberg (2003) |
| Daphnia obtusa | Crustacean | mtDNA | 36 | Penton et al. (2004) |
| Emerita talpoida | Crustacean | mtDNA | 4 | Tam et al. (1996) |
| Farfantepenaeus aztecus | Crustacean | mtDNA | 76 | McMillen-Jackson and Bert (2003) |
| Litopenaeus setiferus | Crustacean | mtDNA | 92 | McMillen-Jackson and Bert (2003) & Maggioni et al. (2001) & Vazquez-Bader et al. (2004) & Bremer et al. (2010) |
| Pagarus longicarpus | Crustacean | mtDNA | 67 | Young et al. (2002) |
| Pagarus pollicaris | Crustacean | mtDNA | 13 | Young et al. (2002) |
| Busycon sinistrum | Gastropod | mtDNA | 31 | Wise et al. (2004) |
| Lampsilis altilis | Mollusk | mtDNA | 5 | Roe et al. (2001) |
| Lampsilis australis | Mollusk | mtDNA | 5 | Roe et al. (2001) |
| Lampsilis ovata | Mollusk | mtDNA | 2 | Roe et al. (2001) & Campbell et al. (2005) |
| Lampsilis perovalis | Mollusk | mtDNA | 5 | Roe et al. (2001) |
| Lampsilis teres | Mollusk | mtDNA | 2 | Roe et al. (2001) & Lydeard et al. (2000) |
| Spisula solidissima | Mollusk | mtDNA | 52 | Hare and Weinberg (2005) |
| Ambystoma tigrinum | Amphibian | mtDNA | 56 | Church et al. (2003) |
| Desmognathus wrightii | Amphibian | mtDNA | 29 | Crespi et al. (2003) |
| Eumeces fasciatus | Amphibian | mtDNA | 82 | Howes et al. (2006) |
| Eurycea bislineata | Amphibian | mtDNA | 56 | Kozak et al. (2006) |
| Eurycea cirrigera | Amphibian | mtDNA | 251 | Kozak et al. (2006) |
| Eurycea junaluska | Amphibian | mtDNA | 6 | Kozak et al. (2006) |
| Eurycea multiplicata | Amphibian | mtDNA | 46 | Bonett & Chippindale (2004) |
| Eurycea tymerensis | Amphibian | mtDNA | 16 | Bonett & Chippindale (2004) |
| Eurycea wilderae | Amphibian | mtDNA | 129 | Kozak et al. (2006) |
Intraspecific genetic diversity is a key aspect of biodiversity. Quaternary climatic change and glaciation influenced intraspecific genetic diversity by promoting range shifts and population size change. However, the extent to which glaciation affected genetic diversity on a global scale is not well established. Here we quantify nucleotide diversity, a common metric of intraspecific genetic diversity, in more than 38,000 plant and animal species using georeferenced DNA sequences from millions of samples. Results demonstrate that tropical species contain significantly more intraspecific genetic diversity than nontropical species. To explore potential evolutionary processes that may have contributed to this pattern, we calculated summary statistics that measure population demographic change and detected significant correlations between these statistics and latitude. We find that nontropical species are more likely to deviate from neutral expectations, indicating that they have historically experienced dramatic fluctuations in population size likely associated with Pleistocene glacial cycles. By analyzing the most comprehensive data set to date, our results imply that Quaternary climate perturbations may be more important as a process driving the latitudinal gradient in species richness than previously appreciated.
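As a rough illustration of the metric named above, nucleotide diversity is commonly defined as the average proportion of sites that differ between pairs of aligned sequences from one species. The sketch below is not the authors' pipeline. The function name `nucleotide_diversity`, the equal weighting of every sequence pair, and the choice to skip sites where either sequence has a gap or ambiguous base are all assumptions made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise proportion of differing sites among aligned sequences.

    Assumption (not from the source): sites where either sequence carries a
    gap or an ambiguous base are skipped for that pair (pairwise deletion).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            # Only count sites where both bases are unambiguous.
            if x in "ACGT" and y in "ACGT":
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            total += diffs / compared
            n_pairs += 1

    # NaN signals that no pair shared any comparable site.
    return total / n_pairs if n_pairs else float("nan")


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from any study.
    alignment = ["ACGT", "ACGA", "ACGT"]
    print(nucleotide_diversity(alignment))
```

In the toy alignment, two of the three pairs differ at one of four sites, so the value is (0.25 + 0 + 0.25) / 3. Real data sets would also need the alignment and curation steps that come before this calculation.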