Premise: The digitization of natural history collections includes transcribing specimen label data into standardized formats. Born‐digital specimen data, which are gathered in digital formats from the outset, do not need to be transcribed, enabling their efficient integration into digitized collections. Modernizing field collection methods for born‐digital workflows requires the development of new tools and processes. Methods and Results: collNotes, a mobile application, was developed for Android and iOS to supplement traditional field journals. Designed for efficiency in the field, collNotes avoids redundant data entry and does not require cellular service. collBook, a companion desktop application, refines field notes into database‐ready formats and produces specimen labels. Conclusions: collNotes and collBook can be used in combination as a field‐to‐database solution for gathering born‐digital voucher specimen data for plants and fungi. Both programs are open source and use common file types, simplifying either program's integration into existing workflows.
Premise: Large‐scale efforts to digitize herbaria have resulted in more than 18 million publicly available Plantae images on sites such as iDigBio. The automation of image post‐processing will lead to time savings in the digitization of biological specimens, as well as improvements in data quality. Here, new and modified neural network methodologies were developed to automatically detect color reference charts (CRCs), enabling the future automation of various post‐processing tasks. Methods and Results: We used 1000 herbarium specimen images from 52 herbaria to test our novel neural network model, ColorNet, which was developed to identify CRCs smaller than 4 cm². ColorNet achieved a 30% increase in accuracy over the performance of other state‐of‐the‐art models such as Faster R‐CNN. For larger CRCs, we propose modifications to Faster R‐CNN to increase inference speed. Conclusions: Our proposed neural networks detect a range of CRCs, which may enable the automation of post‐processing tasks found in herbarium digitization workflows, such as image orientation or white balance correction.
Herbaria are invaluable sources for understanding the natural world, and in recent years there has been a concerted effort to digitize these collections. To organize such efforts, a method for estimating the necessary labor is desired. This work analyzes digitization productivity reports of 105 participants from eight herbaria, deriving generalized labor estimates that account for human experience. METHODS AND RESULTS: Individuals' rates of digitization were grouped based on cumulative time performing each task and then used to estimate a series of generalized labor projection models. In most cases, productivity was shown to improve with experience, suggesting longer technician retention can reduce labor requirements by 20%. CONCLUSIONS: Using student labor is a common tactic for digitization efforts, and the resulting outreach exposes future professionals to natural history collections. However, overcoming the learning curve should be considered when estimating the labor necessary to digitize a collection. KEY WORDS: biodiversity data; digitization rates; herbaria; natural history collections. Applications in Plant Sciences 9(4): e11415. Powell et al., Estimating herbarium specimen digitization rates.
Taxonomy is at the center of modern biodiversity science. No species can be systematically studied until it is defined, and no observation can be linked to related data without a taxonomic label. However, taxonomy is also a science in constant flux: even well-studied groups like Mammalia have fluctuated by >25% in recognized species in the last decade (Burgin et al. 2018, MDD 2022a, MDD 2022b). As a result, there are calls to create a “global list of accepted species” to increase taxonomic stability, particularly for policy decisions in biodiversity conservation and management (Garnett et al. 2020). The counterargument notes that forcing definitional consensus is likely to further inequities, and that a pluralistic, coordinated approach to taxonomy can be achieved with innovative cyberinfrastructure designs and services (Sterner et al. 2020, Franz and Sterner 2018). Here, we propose that digitally “extended” taxonomic curation can play new and innovative roles in linking observational data to alternative taxonomic concepts and in enabling fit-for-use taxonomy to inform policy decisions. Taxonomic curators (TCs) have traditionally limited their activities to making lists of accepted species and higher taxa. However, most of today's biodiversity questions require observational data (e.g., specimen occurrences) that are taxonomically coherent, not just name lists, and require those linked data to be digitally available in public databases. If the collective activities of TCs can be effectively unified across distributed networks, they might facilitate the transition to Extended Specimen Networks of taxonomically coherent biodiversity data, a core goal of current research initiatives (e.g., Lendemer et al. 2020). Beyond lists of species names is the domain of what names mean in practice (i.e., taxonomic concepts), which often differs by author (Fig. 1). Here we argue that curating the various lines of evidence that represent taxonomic concepts, which we call Species Meaning Artifacts (SMArts), is a promising strategy for keeping track of how species splits and lumps will affect observational data records in the Global Biodiversity Information Facility (GBIF) or the National Center for Biotechnology Information (NCBI). Instead of labeling records by a static name, records can be digitally associated with SMArt evidence from alternative taxonomies (e.g., geographic range maps before and after a species split). Networks of TCs curating digital SMArts will enable “taxonomically intelligent” data aggregation (Bisby 2000), a long-pursued goal in biodiversity data science that, once realized, promises to enable investigations ranging from viral spillover to biodiversity loss (Upham et al. 2021).