The effort and cost required to convert satellite Earth Observation (EO) data into meaningful geophysical variables have prevented the systematic analysis of all available observations. To overcome these problems, we utilise an integrated High Performance Computing and Data environment to rapidly process, restructure and analyse the Australian Landsat data archive. In this approach, the EO data are assigned to a common grid framework that spans the full geospatial and temporal extent of the observations: the EO Data Cube. The approach is pixel-based and incorporates geometric and spectral calibration and quality assurance of each Earth surface reflectance measurement. We demonstrate its utility with rapid time-series mapping of surface water across the entire Australian continent using 27 years of continuous, 25 m resolution observations. Our preliminary analysis of the Landsat archive shows how the EO Data Cube can effectively liberate high-resolution EO data from their complex sensor-specific data structures and revolutionise our ability to measure environmental change.
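The pixel-based idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the flat observation tuples, the function names `cell_index`, `build_cube` and `water_frequency`, the grid origin, and the precomputed water and quality flags are all assumptions made for illustration. A production data cube would hold tiled arrays rather than Python lists.

```python
from collections import defaultdict
from math import floor

CELL_SIZE_M = 25.0  # assumed cell size, matching the 25 m resolution quoted above


def cell_index(x, y, origin_x=0.0, origin_y=0.0, cell_size=CELL_SIZE_M):
    """Snap a projected coordinate (metres) to a (col, row) cell of the common grid."""
    return (floor((x - origin_x) / cell_size), floor((y - origin_y) / cell_size))


def build_cube(observations):
    """Group observations by grid cell and keep each cell's time series in date order."""
    cube = defaultdict(list)
    for x, y, date, is_water, is_clear in observations:
        cube[cell_index(x, y)].append((date, is_water, is_clear))
    for series in cube.values():
        series.sort(key=lambda obs: obs[0])  # ISO dates sort chronologically as strings
    return dict(cube)


def water_frequency(series):
    """Fraction of clear observations flagged as water, or None if none are clear."""
    clear = [is_water for _, is_water, is_clear in series if is_clear]
    return sum(clear) / len(clear) if clear else None


# Hypothetical observations: (x, y, ISO date, water flag, quality flag).
observations = [
    (20.0, 40.0, "1995-06-11", False, True),
    (12.0, 30.0, "1990-01-05", True, True),
    (5.0, 26.0, "2001-03-02", True, False),    # fails quality assurance, so excluded
    (60.0, 10.0, "1990-01-05", False, False),  # the only observation in its cell
]
cube = build_cube(observations)
frequency = {cell: water_frequency(series) for cell, series in cube.items()}
```

Keying every measurement by grid cell rather than by scene is what lets a per-pixel time series be read without reference to the original sensor-specific file layout.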
Datacubes are increasingly being implemented to manage big data workflows efficiently, particularly those for processing geospatial data. However, there is confusion about both the definition of the term “datacube” and the choices for how it is implemented. This confusion, together with the conventional approach to managing spatial data (i.e., in map-projected data sets), has led to a restricted set of datacube implementations that are each tightly coupled to the spatial constraints of the data and to how they are stored on disc, resulting in barriers to interoperability, particularly on global scales. This article discusses the implementation options and shows how a datacube can be built on discrete global grid systems while using the same topologies as conventional datacubes. Such datacubes provide a flexible spatial data infrastructure that retains the topological advantages of conventional geospatial datacubes, while reducing barriers to the interoperability of both raster and vector data and providing additional functionality. They also potentially provide a very efficient approach to connecting to big data sources in order to extract datasets on demand before proceeding to multi-level intelligent big data processing, mining, machine learning, and visualization.
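A rough sketch of the cell-indexed idea follows, under loud assumptions. A true discrete global grid system uses equal-area cells; the longitude/latitude quadtree below does not, and stands in only to show how hierarchical cell identifiers can replace map-projected coordinates as the spatial key of a datacube. The names `quadkey` and `aggregate` and the sample values are hypothetical and are not taken from the article.

```python
def quadkey(lon, lat, level):
    """Hierarchical cell ID with one digit (0-3) per refinement level.

    Assumption: a plain lon/lat quadtree, which is NOT equal-area and so is
    only a stand-in for a real discrete global grid system.
    """
    west, east, south, north = -180.0, 180.0, -90.0, 90.0
    digits = []
    for _ in range(level):
        mid_lon = (west + east) / 2
        mid_lat = (south + north) / 2
        digit = 0
        if lon >= mid_lon:
            digit += 1
            west = mid_lon
        else:
            east = mid_lon
        if lat >= mid_lat:
            digit += 2
            south = mid_lat
        else:
            north = mid_lat
        digits.append(str(digit))
    return "".join(digits)


def aggregate(cube, level):
    """Mean value per (parent cell, time) at a coarser level, using ID prefixes."""
    groups = {}
    for (cell, time), value in cube.items():
        groups.setdefault((cell[:level], time), []).append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Hypothetical samples: (longitude, latitude, year, value).
samples = [
    (149.1, -35.3, 2020, 10.0),
    (115.9, -31.95, 2020, 20.0),
    (-0.1, 51.5, 2020, 30.0),
]
cube = {(quadkey(lon, lat, 8), year): value for lon, lat, year, value in samples}
coarse = aggregate(cube, 1)
```

Because a child cell's identifier extends its parent's, moving between resolutions reduces to a prefix operation on the key, independent of any map projection or on-disc layout.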