2021
DOI: 10.1101/2021.01.28.428644
Preprint

Biodiversity Image Quality Metadata Augments Convolutional Neural Network Classification of Fish Species

Abstract: Biodiversity image repositories are crucial sources of training data for machine learning approaches to biological research. Metadata, specifically metadata about object quality, is putatively an important prerequisite to selecting sample subsets for these experiments. This study demonstrates the importance of image quality metadata to a species classification experiment involving a corpus of 1935 fish specimen images which were annotated with 22 metadata quality properties. A small subset of high quality imag…

Citations: cited by 5 publications (14 citation statements)
References: 15 publications
“…This point is also emphasized by Wieczorek et al [11] in their report on the variety of DwC metadata extensions needed to meet growing community concerns and requirements, including data quality and fitness. This point is addressed in detail by Leipzig et al [5], drawing from Tulane University's manual curation of 22 metadata properties that characterize digitized specimen image quality, and further motivates the research reported in this paper.…”
Section: Related Work 2.1 Metadata Standards And Approaches Fo...
Citation type: mentioning
confidence: 71%
“…A key impetus has been engagement of both teams in the NSF Biology Guided Neural Networks (BGNN) project, which is developing a novel class of artificial neural networks that aims to exploit machine readable, predictive knowledge associated with specimen images, phylogenies, and anatomical ontologies. Initial research successfully demonstrated computational approaches for creating image quality metadata [5]; and, further, that by combining ML and image informatics, researchers automatically determine image quality and metadata, such as fish quantity, location and orientation, and image scaling based on ruler identification and measurement [6].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Technicians employed by Tulane University have manually generated the 22 metadata properties deemed crucial to the overall BGNN project [9] for a large number of INHS images. 20,699 total entries were created by 13 technicians that spanned 8,398 unique images, of which 7,244 were both not part of the detectron training set and met our current admissibility criteria for detectron and pixel processing.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…Using ML and image informatics algorithms, it is able to locate, mask and analyze specimens (currently limited to fish) in collection images with a high degree of accuracy. It produces 6 of the 22 core BGNN metadata properties [9], as well as image contrast, bounding boxes, scale and length information. Testing this approach on 7,244 images from the INHS dataset [4], we see that the vast majority of the resulting metadata is correct within a tolerance of a few percentage points, and in some cases contains fewer mistakes than the manually generated validation data.…”
Section: Discussion
Citation type: mentioning
confidence: 99%