2022
DOI: 10.48550/arxiv.2208.02707
Preprint

Standardizing and Centralizing Datasets to Enable Efficient Training of Agricultural Deep Learning Models

Abstract: In recent years, deep learning models have become the standard for agricultural computer vision. Such models are typically fine-tuned to agricultural tasks using model weights that were originally fit to more general, non-agricultural datasets. This lack of agriculture-specific fine-tuning potentially increases training time and resource use, and decreases model performance, leading to an overall decrease in data efficiency. To overcome this limitation, we collect a wide range of existing public datasets for thre…

Cited by 2 publications (3 citation statements)
References 32 publications
“…Rapid measurement of traits associated with these processes has the potential to enhance our understanding of their distribution and interactions with canopy structure. In comparison to many analytical tools for plant phenotyping such as PlantCV [51] and HSI-PP [52] for plant image analysis, and AgML [22] and CropSight for data management [53], the current framework addresses the need for synthetic imagery and offers a solution for multimodal analysis tools. Compared to the LESS [20] and DART-Lux [27] models, the current framework offers integrated models for generating and modifying model geometry via Helios, making it especially suited for proximal remote sensing applications.…”
Section: Discussion (mentioning)
confidence: 99%
“…Although several publicly available annotated image datasets for agricultural applications are available for machine learning model training and other plant phenotyping applications, such as the Annotated Crop Image Dataset [21], Michigan State University Plant Imagery Dataset (MSU-PID) [13], AgML [22], and KOMATSUNA dataset [15], these data repositories are not sufficiently broad to capture the extensive variability that exists within agricultural machine learning tasks. The limitations of machine learning approaches become evident with small and low-variation datasets, as they can lead to severe overfitting, and the resulting models are often not readily transferable across different light conditions, species, or phenotyping platforms, revealing a lack of generalization and posing a substantial risk of extrapolation errors [2, 22]. Past researchers have utilized data augmentation methods like random cropping, scaling, rotation, and flipping in the spatial domain [21, 23], and introduction of random variations in mean offset and slope of the spectral reflectance [24].…”
Section: Introduction (mentioning)
confidence: 99%
“…The AgML project (https://github.com/Project-AgML/AgML) is an open-source platform designed to assist the training of machine learning models in agriculture, providing a standardised format for managing data and models (Joshi et al., 2022). A common platform, specific to agriculture, with a consistent data management, training and evaluation pipeline would greatly benefit the efficiency of weed recognition and plant phenotyping DL research and development.…”
Section: From Data To Decision (mentioning)
confidence: 99%