The unprecedented size of the human population, along with its associated economic activities, has an ever-increasing impact on global environments. Across the world, countries are concerned about growing resource consumption and the capacity of ecosystems to provide resources. To conserve biodiversity effectively, it is essential to make indicators and knowledge openly available to decision-makers in forms they can readily use. Developing and deploying the tools and techniques that generate these indicators requires access to trustworthy data from biological collections, field surveys, automated sensors, molecular data, and historical academic literature. Transforming these raw data into synthesized information that is fit for use requires many refinement steps. The methodologies and techniques applied to manage and analyze these data constitute an area usually called biodiversity informatics. Biodiversity data follow a life cycle consisting of planning, collection, certification, description, preservation, discovery, integration, and analysis. Researchers, whether producers or consumers of biodiversity data, will likely perform activities related to at least one of these steps. This article explores each stage of the life cycle of biodiversity data, discussing its methodologies, tools, and challenges. This article is categorized under: Algorithmic Development > Biological Data Mining
Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments. Because these experiments are computation- and data-intensive, they require high-performance computing techniques and can benefit from specialized technologies such as Scientific Workflow Management Systems and databases. In this work, we present BioWorkbench, a framework for managing and analyzing bioinformatics experiments. This framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application. Provenance data can be analyzed through a web application that abstracts a set of queries to the provenance database, simplifying access to provenance information. We evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a RASopathy analysis workflow. We analyze each workflow from both computational and scientific domain perspectives, using queries to a provenance and annotation database. Some of these queries are available as a pre-built feature of the BioWorkbench web application. Through the provenance data, we show that the framework is scalable and achieves high performance, reducing the execution time of the case studies by up to 98%. We also show how the application of machine learning techniques can enrich the analysis process.
The field of distributional ecology has seen considerable recent attention, particularly surrounding the theory, protocols, and tools for Ecological Niche Modeling (ENM) or Species Distribution Modeling (SDM). Such analyses have grown steadily over the past two decades—including a maturation of relevant theory and key concepts—but methodological consensus has yet to be reached. In response, and following an online course taught in Spanish in 2018, we designed a comprehensive English-language course covering much of the underlying theory and methods currently applied in this broad field. Here, we summarize that course, ENM2020, and provide links by which resources produced for it can be accessed into the future. ENM2020 lasted 43 weeks, with presentations from 52 instructors, who engaged with >2500 participants globally through >14,000 hours of viewing and >90,000 views of instructional video and question-and-answer sessions. Each major topic was introduced by an “Overview” talk, followed by more detailed lectures on subtopics. The hierarchical and modular format of the course permits updates, corrections, or alternative viewpoints, and generally facilitates revision and reuse, including the use of only the Overview lectures for introductory courses. All course materials are free and openly accessible (CC-BY license) to ensure these resources remain available to all interested in distributional ecology.
Music streaming platforms are increasingly popular, democratizing and easing access to musical content. This effect broadens the reach and penetration of different musical styles, increasing the diversity of genres listened to in countries around the world. To better understand this diversity and to identify countries with shared interests, in this article we build and analyze a complex network of artists, musical genres, and countries using data from Spotify, one of the most widely used music streaming platforms today. As results, besides identifying communities of countries with similar musical styles, we show how the large number and diversity of musical genres can influence the modeling and analysis of the network considered. We also rank the most commonly listened-to genres using different centrality metrics.
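As a rough illustration of the centrality ranking mentioned in the abstract above, the following minimal sketch builds a small country-genre network and ranks genres by normalized degree centrality. The edge list, the function names, and the choice of degree centrality are all assumptions made for illustration; the abstract does not say which metrics or data schema the authors used.

```python
from collections import defaultdict

# Hypothetical toy data: (country, genre) pairs, NOT taken from the paper.
EDGES = [
    ("BR", "sertanejo"), ("BR", "pop"), ("BR", "funk"),
    ("PT", "pop"), ("PT", "fado"),
    ("US", "pop"), ("US", "hip hop"),
    ("MX", "pop"), ("MX", "reggaeton"),
    ("AR", "reggaeton"), ("AR", "pop"),
]


def build_graph(edges):
    """Return an undirected adjacency map for a country-genre network."""
    adj = defaultdict(set)
    for country, genre in edges:
        # Tag each node with its kind so a country and a genre never collide.
        adj[("country", country)].add(("genre", genre))
        adj[("genre", genre)].add(("country", country))
    return adj


def degree_centrality(adj):
    """Degree of each node divided by (n - 1), the usual normalization."""
    n = len(adj)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}


def rank_genres(edges):
    """Rank genre nodes by degree centrality, ties broken alphabetically."""
    centrality = degree_centrality(build_graph(edges))
    genres = [
        (name, score)
        for (kind, name), score in centrality.items()
        if kind == "genre"
    ]
    return sorted(genres, key=lambda item: (-item[1], item[0]))


ranking = rank_genres(EDGES)
for genre, score in ranking:
    print(f"{genre}: {score:.2f}")
```

Other metrics (betweenness, closeness, eigenvector) would slot into the same ranking step; a graph library would be the more practical choice for networks of realistic size.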
Reproducibility is a fundamental requirement of the scientific process since it enables outcomes to be replicated and verified. Computational scientific experiments can benefit from improved reproducibility for many reasons, including validation of results and reuse by other scientists. However, designing reproducible experiments remains a challenge, which creates a need for methodologies and tools that support this process. Here, we propose a conceptual model for reproducibility to specify its main attributes and properties, along with a framework that allows computational experiments to be findable, accessible, interoperable, and reusable. We present a case study in ecological niche modeling to demonstrate and evaluate the implementation of this framework.