There is increasing evidence that smallholder farms contribute substantially to food production globally, yet spatially explicit data on agricultural field sizes are currently lacking. Two approaches have been used to date: automated field size delineation using remote sensing and estimation of average farm size at the subnational level using census data. Both have limitations, however: automated delineation from remote sensing has not yet been implemented at a global scale, while census-based estimates have very coarse spatial resolution. This paper demonstrates a unique approach to quantifying and mapping agricultural field size globally using crowdsourcing. A campaign was run in June 2017, in which participants were asked to visually interpret very high resolution satellite imagery from Google Maps and Bing using the Geo‐Wiki application. During the campaign, participants collected field size data for 130,000 unique locations around the globe. Using this sample, we have produced the most accurate global field size map to date and estimated the percentage of different field sizes, ranging from very small to very large, in agricultural areas at global, continental, and national levels. The results show that smallholder farms occupy up to 40% of agricultural areas globally, suggesting that there are potentially many more smallholder farms than the two current global estimates of 12% and 24% indicate. The global field size map and the crowdsourced data set are openly available and can be used for integrated assessment modeling, comparative studies of agricultural dynamics across different contexts, training and validation of remote sensing approaches to field size delineation, and potential contributions to the Sustainable Development Goal of ending hunger, achieving food security and improved nutrition, and promoting sustainable agriculture.
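As a rough illustration of how a crowdsourced point sample of this kind can be summarized into field-size shares, the minimal sketch below computes the (optionally design-weighted) percentage of agricultural sample locations in each field-size class. The column names, class labels, and weights are hypothetical and do not reflect the actual Geo-Wiki campaign schema or sampling design.

```python
# Hypothetical sketch: estimating the share of agricultural area in each
# field-size class from a crowdsourced point sample. Field names, class
# labels, and weights are illustrative only.
import pandas as pd

# Each record is one interpreted sample location: its dominant field-size
# class and an optional inclusion weight from the sampling design.
sample = pd.DataFrame({
    "field_size_class": ["very small", "small", "medium", "large",
                         "small", "very large", "medium", "small"],
    "weight": [1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 1.0],
})

# Weighted share of each class among agricultural sample locations (%).
shares = (sample.groupby("field_size_class")["weight"].sum()
          / sample["weight"].sum() * 100).round(1)
print(shares.sort_values(ascending=False))
```

The same aggregation could be repeated per country or continent to obtain the subnational and continental breakdowns described above.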
Very high resolution (VHR) satellite imagery from Google Earth and Microsoft Bing Maps is increasingly being used in a variety of applications, from computer science to the arts and humanities. In the field of remote sensing, one use of this imagery is to create reference data sets through visual interpretation, e.g., to complement existing training data or to aid in the validation of land-cover products. Through new applications such as Collect Earth, this imagery is also being used for monitoring purposes in the form of statistical surveys obtained through visual interpretation. However, little is known about where VHR satellite imagery is available globally or about the dates of the imagery. Here we present a global overview of the spatial and temporal distribution of VHR satellite imagery in Google Earth and Microsoft Bing Maps. The results show uneven availability globally, with coverage biased towards certain areas such as the USA, Europe, and India, and with clear discontinuities at political borders. We also show that the availability of VHR imagery is currently not adequate for monitoring protected areas and deforestation, but is better suited to monitoring changes in cropland or urban areas using visual interpretation.
Floods affect more people globally than any other type of natural hazard. Great potential exists for new technologies to support flood disaster risk reduction. In addition to existing expert-based data collection and analysis, direct input from communities and citizens across the globe may also be used to monitor, validate, and reduce flood risk. New technologies have already been proven to effectively aid in humanitarian response and recovery. However, while ex-ante technologies are increasingly utilized to collect information on exposure, efforts directed towards assessing and monitoring hazards and vulnerability remain limited. Hazard model validation and social vulnerability assessment deserve particular attention. New technologies offer great potential for engaging people and facilitating the coproduction of knowledge.
Remote sensing, or Earth Observation (EO), is increasingly used to understand Earth system dynamics and to create continuous and categorical maps of biophysical properties and land cover, especially following recent advances in machine learning (ML). ML models typically require large, spatially explicit training datasets to make accurate predictions. Training data (TD) are typically generated by digitizing polygons on high spatial-resolution imagery, by collecting in situ data, or by using pre-existing datasets. TD are often assumed to accurately represent the truth, but in practice almost always contain error, stemming from (1) sample design and (2) sample collection errors. The latter is particularly relevant for image-interpreted TD, an increasingly common approach owing to its practicality and to the growing training sample size requirements of modern ML algorithms. TD errors can cause substantial errors in the maps created using ML algorithms, which may affect map use and interpretation. Despite these potential errors and their real-world consequences for map-based decisions, TD error is often not accounted for or reported in EO research. Here we review current practices for collecting and handling TD. We identify the sources of TD error, illustrate their impacts using several case studies representing different EO applications (infrastructure mapping, global surface flux estimates, and agricultural monitoring), and provide guidelines for minimizing and accounting for TD errors. To harmonize terminology, we distinguish TD from three other classes of data that should be used to create and assess ML models: training reference data, used to assess the quality of TD during data generation; validation data, used to iteratively improve models; and map reference data, used only for final accuracy assessment. We focus primarily on TD, but our advice is generally applicable to all four classes, and we ground our review in the established best practices of the map accuracy assessment literature. EO researchers should start by determining the tolerable level of map error and appropriate error metrics. Next, TD error should be minimized during sample design by choosing a representative spatio-temporal collection strategy, by using spatially and temporally relevant imagery and ancillary data sources during TD creation, and by selecting a set of legend definitions supported by the data. Furthermore, TD error can be minimized during the collection of individual samples by using consensus-based collection strategies, by directly comparing interpreted training observations against expert-generated training reference data to derive TD error metrics, and by providing image interpreters with thorough application-specific training. We strongly advise that TD error be incorporated in model outputs, either directly in bias and variance estimates or, at a minimum, by documenting the sources and implications of error. TD should be fully documented and made available via an open TD repository, allowing others to replicate and assess its use. To guide researchers in this process, we propose three tiers of TD error accounting standards. Finally, we advise researchers to clearly communicate the magnitude and impacts of TD error on map outputs, with specific consideration given to the likely map audience.
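To make two of the recommended practices concrete, the sketch below shows, under assumed data structures, (1) a simple consensus-based collection strategy (majority vote across interpreters) and (2) a training-data error rate derived by comparing consensus labels against expert-generated training reference data. The sample identifiers, interpreter names, and labels are hypothetical and not taken from the case studies described above.

```python
# Hypothetical sketch: majority-vote consensus labels and a TD error rate
# relative to expert reference labels. All values are illustrative.
from collections import Counter
import pandas as pd

# Labels assigned to the same training samples by three interpreters.
labels = pd.DataFrame({
    "sample_id":   [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "interpreter": ["a", "b", "c"] * 3,
    "label":       ["crop", "crop", "grass", "crop", "crop", "crop",
                    "grass", "crop", "grass"],
})
# Expert-generated training reference labels for the same samples.
reference = pd.Series({1: "crop", 2: "crop", 3: "crop"}, name="reference")

# (1) Consensus-based collection: majority vote per sample.
consensus = (labels.groupby("sample_id")["label"]
             .agg(lambda x: Counter(x).most_common(1)[0][0]))

# (2) TD error metric: disagreement between consensus and reference labels.
joined = pd.concat([consensus.rename("consensus"), reference], axis=1)
error_rate = (joined["consensus"] != joined["reference"]).mean()
print(joined)
print(f"Estimated training-data error rate: {error_rate:.0%}")
```

An error rate estimated this way could then feed into the bias and variance reporting, or at least the error documentation, recommended above.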