2022
DOI: 10.1007/s00799-022-00342-1

Computational metadata generation methods for biological specimen image collections

Abstract: Metadata is a key data source for researchers seeking to apply machine learning (ML) to the vast collections of digitized biological specimens that can be found online. Unfortunately, the available metadata is often sparse and, at times, erroneous. This paper extends previous research with the Illinois Natural History Survey (INHS) collection (7,244 specimen images) using computational approaches to analyze image quality, and then automatically generate 22 metadata properties representing the image quality and…

Cited by 7 publications (7 citation statements)
References 48 publications
“…Few dynamic methods that rely on machine learning and computer vision techniques have been developed to automate CF determination. One study used an object detection algorithm to locate the number “2” and the number “3” on rulers to compute CFs, but was limited to only two ruler types; the authors advocated for an approach that located tick marks directly (Karnani et al, 2022). With LeafMachine2, we required a more generalizable procedure for obtaining image‐specific CFs.…”
Section: Methods (mentioning)
confidence: 99%
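The ruler-based idea described in the statement above can be illustrated with a minimal sketch. It is not the cited authors' implementation. It assumes a metric ruler on which the printed "2" and "3" are 1 cm apart, that an object detector has already returned one bounding box per digit as `(x_min, y_min, x_max, y_max)` in pixels, and that the CF is expressed as centimeters per pixel. The function names `box_center` and `conversion_factor` are hypothetical.

```python
import math


def box_center(box):
    """Return the (x, y) center of a bounding box given as
    (x_min, y_min, x_max, y_max) in pixels."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def conversion_factor(box_two, box_three, real_distance_cm=1.0):
    """Estimate centimeters per pixel from the detected "2" and "3" labels.

    Assumption: the two labels are real_distance_cm apart on the ruler
    (1 cm on a metric ruler); this would differ for other ruler types.
    """
    cx2, cy2 = box_center(box_two)
    cx3, cy3 = box_center(box_three)
    # Euclidean distance handles rulers that are rotated in the image.
    pixel_distance = math.hypot(cx3 - cx2, cy3 - cy2)
    if pixel_distance == 0:
        raise ValueError("The two detections coincide; cannot compute a CF.")
    return real_distance_cm / pixel_distance


# Hypothetical detections: the digit centers are 200 px apart horizontally.
cf = conversion_factor((100, 50, 120, 80), (300, 50, 320, 80))
length_cm = 850 * cf  # convert a measured length of 850 px to centimeters
```

Because the estimate depends on finding two specific printed digits, it fails on rulers that lack them or space them differently, which is consistent with the limitation noted in the statement.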
“…The first ML component (Figures 1 and 2, step 4a) performs object detection and metadata generation as defined by the rule 2021), and expanded upon in Karnani et al (2022). The domain scientists and software engineer worked with the ML researchers to containerize their codebase for reuse in this workflow.…”
Section: Metadata Generation (mentioning)
confidence: 99%
“…MorphoSource (Boyer et al., 2016), iDigBio (http://idigbio.org) and iNaturalist (http://inaturalist.org)]. Image data are ripe for applications of ML techniques, including neural networks (NN), to extract information such as metadata (Karnani et al., 2022; Leipzig et al., 2021; Pepper et al., 2021; Rinaldo et al., 2022; Stork et al., 2019), species classification (Schuettpelz et al., 2017; Wäldchen & Mäder, 2018; Wilf et al., 2016) and presence of traits (Alfaro et al., 2019; Lürig et al., 2021; MacLeod, 2017; Weeks et al., 2016). Although ML offers powerful tools for automatic object detection and subsequent analysis of biological image data, no single ML technique provides a complete solution.…”
Section: Introduction (mentioning)
confidence: 99%
“…Together, species identification and taxonomic and meta-data extraction methods from images represent a powerful tool for unlocking the full potential of natural history collections. These approaches can make data more discoverable and usable for documenting biodiversity both in collections and in the field (Karnani et al, 2022; Schuettpelz et al, 2017; Wäldchen and Mäder, 2018; White et al, 2020). Information on specimens is not limited to museum catalogues but is also available in the wealth of scientific publications detailing and imaging specimens for varied purposes.…”
Section: Identifying and Cataloguing Specimen Data (mentioning)
confidence: 99%