Purpose: The required training sample size for a particular machine learning (ML) model applied to medical imaging data is often unknown. The purpose of this study was to provide a descriptive review of current sample-size determination methodologies in ML applied to medical imaging and to propose recommendations for future work in the field. Methods: We conducted a systematic literature search of articles using Medline and Embase with keywords including “machine learning,” “image,” and “sample size.” The search included articles published between 1946 and 2018. Data regarding the ML task, sample size, and train-test pipeline were collected. Results: A total of 167 articles were identified, of which 22 were included for qualitative analysis. There were only 4 studies that discussed sample-size determination methodologies, and 18 that tested the effect of sample size on model performance as part of an exploratory analysis. The observed methods could be categorized as pre hoc model-based approaches, which relied on features of the algorithm, or post hoc curve-fitting approaches requiring empirical testing to model and extrapolate algorithm performance as a function of sample size. Between studies, we observed great variability in performance testing procedures used for curve-fitting, model assessment methods, and reporting of confidence in sample sizes. Conclusions: Our study highlights the scarcity of research in training set size determination methodologies applied to ML in medical imaging, emphasizes the need to standardize current reporting practices, and guides future work in development and streamlining of pre hoc and post hoc sample size approaches.
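The post hoc curve-fitting idea described in this abstract can be illustrated with a minimal sketch. It assumes an inverse power law, err(n) = a * n^(-b), with no irreducible error term, which is one common choice of learning-curve model and not necessarily the one used by any of the reviewed studies. The function names and the synthetic measurements are hypothetical.

```python
import math


def fit_power_law(sizes, errors):
    """Fit err(n) = a * n**(-b) by least squares in log-log space.

    Assumption: the error decays to zero (no asymptote term), so
    log(err) = log(a) - b * log(n) is a straight line.
    Returns the pair (a, b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def required_sample_size(a, b, target_error):
    """Extrapolate the fitted curve: solve a * n**(-b) = target_error for n."""
    return math.ceil((a / target_error) ** (1.0 / b))


# Synthetic pilot measurements (illustrative only, not taken from any study).
pilot_sizes = [100, 200, 400, 800]
pilot_errors = [2.0 * n ** -0.5 for n in pilot_sizes]

a, b = fit_power_law(pilot_sizes, pilot_errors)
n_needed = required_sample_size(a, b, target_error=0.05)
```

In practice the pilot errors would come from repeated train-test runs at each subsample size, and the extrapolated size should be reported with some measure of uncertainty, which is one of the reporting gaps the review points to.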
A correct delineation of agricultural parcels is a primary requirement for any parcel-based application, such as the estimation of agricultural subsidies. Currently, high-resolution remote-sensing images provide useful spatial information to delineate parcels; however, their manual processing is highly time-consuming. Thus, methods that perform this task automatically are needed. In this work, the use of a machine-learning algorithm to delineate agricultural parcels is explored through a novel methodology. The proposed methodology combines superpixels and supervised classification to determine which adjacent superpixels should be merged, transforming the segmentation problem into a machine-learning problem. A visual evaluation of the results obtained by applying the methodology to two areas of a high-resolution satellite image of a fragmented agricultural landscape indicates that the use of a machine-learning algorithm for this task is promising.
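The merging step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how merge decisions are turned into parcels, so a union-find structure is used here, and `should_merge` is a hypothetical stand-in for the trained classifier (here a simple threshold on the difference in mean intensity).

```python
def merge_superpixels(n_superpixels, adjacent_pairs, should_merge):
    """Group superpixels into segments using pairwise merge decisions.

    adjacent_pairs: iterable of (i, j) index pairs of neighbouring superpixels.
    should_merge:   callable (i, j) -> bool; in a real pipeline this would
                    wrap a supervised classifier (assumption, not from the paper).
    Returns one segment label per superpixel.
    """
    parent = list(range(n_superpixels))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in adjacent_pairs:
        if should_merge(i, j):
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    return [find(i) for i in range(n_superpixels)]


# Toy example: four superpixels in a row, described by mean intensity only.
mean_intensity = [0.10, 0.12, 0.80, 0.82]
pairs = [(0, 1), (1, 2), (2, 3)]


def threshold_rule(i, j):
    # Hypothetical stand-in for classifier.predict on pair features.
    return abs(mean_intensity[i] - mean_intensity[j]) < 0.1


labels = merge_superpixels(4, pairs, threshold_rule)
```

Superpixels 0 and 1 end up in one segment and 2 and 3 in another, because only those neighbouring pairs pass the stand-in rule.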
Very high resolution remotely sensed images are an important tool for monitoring fragmented agricultural landscapes, allowing farmers and policy makers to make better decisions regarding management practices. An object-based methodology is proposed for the automatic generation of thematic maps of the classes present in the scene, combining edge-based and superpixel processing for small agricultural parcels. The methodology employs superpixels instead of pixels as minimal processing units, and links them to meaningful objects (obtained by the edge-based method) in order to facilitate the analysis of parcels. Performance analysis on a scene dominated by small agricultural parcels indicates that the combination of superpixel and edge-based methods achieves a classification accuracy slightly better than when those methods are applied separately, and comparable to the accuracy of traditional object-based analysis, while using an automatic approach.
Accurate and up-to-date information on the spatial and geographical characteristics of agricultural areas is indispensable for the various activities related to agriculture and research. Most agricultural studies and policies are carried out at the field level, for which precise boundaries are required. Today, high-resolution remote sensing images provide useful spatial information for plot delineation; however, manual processing is time-consuming and prone to human error. The objective of this paper is to explore the potential of a deep learning (DL) approach, in particular a convolutional neural network (CNN) model, for the automatic outlining of agricultural plot boundaries from orthophotos over large areas with a heterogeneous landscape. Since DL approaches require a large amount of labeled data to learn, we have exploited the open data from the Land Parcel Identification System (LPIS) of the Chartered Community of Navarre, Spain. The boundaries of the agricultural plots obtained with our methodology were compared with those obtained using a state-of-the-art methodology known as gPb-UCM (global probability of boundary followed by ultrametric contour map), using an error measure called the boundary displacement error (BDE) index. In BDE terms, the results obtained by our method outperform those obtained with the gPb-UCM method. In this regard, CNN models trained with LPIS data are a useful and powerful tool that would reduce intensive manual labor in outlining agricultural plots. INDEX TERMS: Convolutional neural network, deep learning, edge extraction, land parcel identification system, parcels delineation.
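To make the BDE comparison in this abstract more concrete, here is a minimal sketch of a boundary displacement error between two binary boundary masks. It assumes one common convention: for each boundary pixel in one mask, take the Euclidean distance to the nearest boundary pixel in the other, average these distances, and then average the two directions. The paper's exact formulation may differ, and the function names are hypothetical. The brute-force search is only suitable for small masks.

```python
import math


def boundary_points(mask):
    """Return (row, col) coordinates of all non-zero cells in a 2D mask."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def mean_nearest_distance(src, dst):
    """Average distance from each point in src to its nearest point in dst."""
    total = 0.0
    for r1, c1 in src:
        total += min(math.hypot(r1 - r2, c1 - c2) for r2, c2 in dst)
    return total / len(src)


def boundary_displacement_error(mask_a, mask_b):
    """Symmetric BDE in pixels (assumed convention: mean of both directions)."""
    pts_a = boundary_points(mask_a)
    pts_b = boundary_points(mask_b)
    if not pts_a or not pts_b:
        raise ValueError("both masks must contain at least one boundary pixel")
    return 0.5 * (mean_nearest_distance(pts_a, pts_b)
                  + mean_nearest_distance(pts_b, pts_a))


# Toy example: two vertical boundaries that are two pixels apart.
reference = [[0, 1, 0, 0, 0] for _ in range(4)]
predicted = [[0, 0, 0, 1, 0] for _ in range(4)]
bde = boundary_displacement_error(reference, predicted)
```

A lower value means the predicted boundaries lie closer to the reference boundaries; identical masks give zero.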