Abstract. Over the last two decades, the Alfred Wegener Institute (AWI) has continuously committed to developing and sustaining an e-Infrastructure for the coherent discovery, visualization, dissemination and archival of scientific information in polar and marine regions. Most of the data originates from research activities carried out on a wide range of AWI-operated research platforms: vessels, land-based stations, ocean-based stations and aircraft. Archival and publishing in the PANGAEA repository, along with DOI assignment to individual datasets, is a typical end-of-line step for most data owners.
Within AWI, a workflow for data acquisition from vessel-mounted devices, along with ingestion procedures for the raw data into the institutional archives, has been well established for many years. However, the increasing number of ocean-based stations and their sensors, along with heterogeneous project-driven requirements towards satellite communication, sensor monitoring, QA/QC and validation, processing algorithms, visualization and dissemination, has recently led us to build a more generic and cost-effective framework. This framework, hereafter named O2A, has two main strengths: the seamless flow of sensor observations to archives, and compliance with internationally used OGC standards (e.g. SOS/SWE, WPS, WMS, WFS), which ensures interoperability in an international context.
O2A comprises several extensible and exchangeable modules (e.g. controlled vocabularies and gazetteers, file type and structure validation, aggregation solutions, processing algorithms, etc.) as well as various interoperability services. At the first data tier level, each sensor is described following the SensorML data model standard, and the data is fed to an SOS interface that offers streaming solutions and supports O&M encoding.
Project administrators and data specialists can now monitor the individual sensors displayed on a map by clicking on a station and viewing the near real-time data for the selected station and sensor. In addition, the monitoring dashboards we built assist data scientists and administrators with early detection of sensor malfunction (e.g. email/SMS notification), filtering of data values by range (e.g. temperature values above a certain threshold) and data aggregation (e.g. calculation of daily averages).
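The range filtering and daily aggregation mentioned for the dashboards can be illustrated with a minimal sketch. It is not the O2A implementation: the sample readings, the function names `daily_averages` and `above_threshold`, and the threshold of 8.0 are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical near real-time readings: (ISO timestamp, temperature in degrees Celsius).
readings = [
    ("2020-01-01T00:00:00", 1.0),
    ("2020-01-01T12:00:00", 3.0),
    ("2020-01-02T00:00:00", 2.0),
    ("2020-01-02T12:00:00", 9.5),
]

def daily_averages(rows):
    """Group readings by calendar day and return {ISO date: mean value}."""
    by_day = defaultdict(list)
    for stamp, value in rows:
        day = datetime.fromisoformat(stamp).date().isoformat()
        by_day[day].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}

def above_threshold(rows, threshold):
    """Return the readings whose value exceeds the (assumed) alert threshold."""
    return [(stamp, value) for stamp, value in rows if value > threshold]

averages = daily_averages(readings)
alerts = above_threshold(readings, threshold=8.0)
```

In a monitoring setting, a non-empty `alerts` list would be the natural trigger for a notification; how O2A actually sends email or SMS messages is not described in the abstract.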
<p>O2A (Observation to Archive) is a data-flow framework for heterogeneous sources, including multiple institutions and scales of Earth observation. In O2A, once data transmission is set up, processes are executed to automatically ingest (i.e. collect and harmonize) and quality control data in near real-time. We consider a web-based sensor description application to support transmission and harmonization of observational time-series data. We also consider a product-oriented quality control, where a standardized and scalable approach should integrate the diversity of sensors connected to the framework. A review of the literature and of observation networks in marine and terrestrial environments is in progress to allow us, for example, to characterize the quality tests in use for generic and specific applications. In addition, we use a standardized quality flag scheme to support both user and technical levels of information. Looking ahead, a quality score should accompany the quality flag to indicate the overall plausibility of each individual data value or to measure the flagging uncertainty. In this work, we present concepts under development and give insights into the data ingest and quality control currently operating within the O2A framework.</p>
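<p>As a rough illustration of pairing a quality flag with a quality score, the sketch below runs two common tests (a range test and a spike test) on a time series. The flag values, the test limits, and the definition of the score as the fraction of tests passed are assumptions chosen for this example; the abstract does not specify the flag scheme or scoring used in O2A.</p>

```python
from dataclasses import dataclass

# Illustrative flag values only; the actual O2A flag scheme is not given in the text.
GOOD, SUSPECT, BAD = 1, 3, 4

@dataclass
class Flagged:
    value: float
    flag: int
    score: float  # assumed definition: fraction of tests passed

def range_test(value, low, high):
    """Pass if the value lies within the plausible physical range."""
    return low <= value <= high

def spike_test(prev, value, max_step):
    """Pass if the jump from the last good value is not larger than max_step."""
    return prev is None or abs(value - prev) <= max_step

def quality_control(values, low, high, max_step):
    results = []
    prev = None  # last value flagged GOOD, used as spike reference
    for value in values:
        passed = [range_test(value, low, high), spike_test(prev, value, max_step)]
        score = sum(passed) / len(passed)
        if not passed[0]:
            flag = BAD
        elif not passed[1]:
            flag = SUSPECT
        else:
            flag = GOOD
        results.append(Flagged(value, flag, score))
        if flag == GOOD:
            prev = value
    return results

flagged = quality_control([2.0, 2.1, 7.5, 2.2, 99.0], low=-2.0, high=35.0, max_step=3.0)
```

<p>Keeping the score separate from the flag lets a user filter on the coarse flag while a technician inspects how many tests a value failed.</p>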
<p>Today's fast digital growth has made data the most essential tool for scientific progress in Earth Systems Science. Hence, we strive to assemble a modular research infrastructure comprising a collection of tools and services that allow researchers to turn big data into scientific outcomes.</p><p>Major roadblocks are (i) the increasing number and complexity of research platforms, devices, and sensors, (ii) the heterogeneous project-driven requirements towards, e.g., satellite data, sensor monitoring, quality assessment and control, processing, analysis and visualization, and (iii) the demand for near real-time analyses.</p><p>These requirements have led us to build a generic and cost-effective framework <strong>O2A</strong> (<strong>O</strong>bservation <strong>to</strong> <strong>A</strong>rchive) to enable, control, and access the flow of sensor observations to archives and repositories.</p><p>By establishing O2A within major cooperative projects like <strong>MOSES</strong> and <strong>Digital Earth</strong> in the research field Earth and Environment of the German Helmholtz Association, we extend research data management services, computing power, and skills to connect with the evolving software and storage services for data science. 
This fully supports the typical scientific workflow from beginning to end, that is, from data acquisition to final data publication.</p><p>The key modules of O2A's digital research infrastructure, established by AWI to enable Digital Earth Science, implement the <strong>FAIR</strong> principles:</p><ul><li><strong>Sensor Web</strong>, to register sensor applications and capture controlled metadata before and alongside any measurement in the field</li> <li><strong>Data ingest</strong>, allowing researchers to feed data into storage systems and processing pipelines in a prepared and documented way, ideally in controlled near real-time (NRT) data streams</li> <li><strong>Dashboards, </strong>allowing researchers to find and access data, and to share and collaborate among partners</li> <li><strong>Workspace, </strong>enabling researchers to access and use data with research software in a cloud-based virtualized infrastructure that allows them to analyse massive amounts of data on the spot</li> <li><strong>Archiving </strong>and<strong> publishing data </strong>via repositories and Digital Object Identifiers (DOI).</li> </ul>
<p>Earth system cyberinfrastructures include three types of data services: repositories, collections, and federations. These services arrange data by their purpose, level of integration, and governance. For instance, registered data of uniform measurements fulfill the goal of publication but do not necessarily flow in an integrated data system. The data repository provides the first, and a high, level of integration, which strongly depends on the standardization of incoming data. One example is the framework Observation to Archive and Analysis (O2A), which is operational and continuously developed at the Alfred Wegener Institute, Bremerhaven. A data repository is one of the components of the O2A framework, and much of its functionality depends on the standardization of the incoming data. In this context, we focus on developing a modular approach that provides standardization and quality control for the monitoring of near real-time data. Two modules are under development. First, the driver module transforms different tabular data to a common format. Second, the quality control module runs quality tests on the ingested data. Both modules rely on the sensor operator and on the data scientist, two actors who interact with both ends of the ingest component of the O2A framework (http://data.awi.de/o2a-doc). We demonstrate the driver and the quality control modules in the data flow within Digital Earth showcases that also connect repositories and federated databases to the end-user. The end-user is the scientist, who is closely involved in the development to ensure applicability. The result is a demonstrated benefit: harmonized data and metadata from multiple sources, easy integration, and rapid assessment of the ingested data. Further, we discuss concepts and current developments aimed at enhancing monitoring and the scientific workflow.</p>
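<p>The idea of a driver that maps differently structured tabular files onto one common format can be sketched as follows. This is an illustration under stated assumptions, not the O2A driver module: the source names, delimiters, column headers, the `DRIVERS` table, and the function `to_common_format` are all hypothetical.</p>

```python
import csv
import io

# Hypothetical per-source driver configuration: delimiter plus a mapping
# from the source's column headers to the common column names.
DRIVERS = {
    "station_a": {
        "delimiter": ",",
        "columns": {"time": "timestamp", "temp_c": "temperature"},
    },
    "station_b": {
        "delimiter": ";",
        "columns": {"Date/Time": "timestamp", "T [degC]": "temperature"},
    },
}

def to_common_format(text, source):
    """Parse one source's tabular text and return rows in the common format."""
    driver = DRIVERS[source]
    reader = csv.DictReader(io.StringIO(text), delimiter=driver["delimiter"])
    rows = []
    for record in reader:
        row = {"source": source}
        for original, common in driver["columns"].items():
            row[common] = record[original]
        row["temperature"] = float(row["temperature"])  # harmonize the value type
        rows.append(row)
    return rows

# Two made-up inputs with different delimiters and headers.
raw_a = "time,temp_c\n2020-01-01T00:00:00,1.5\n"
raw_b = "Date/Time;T [degC]\n2020-01-01T00:00:00;2.5\n"

harmonized = to_common_format(raw_a, "station_a") + to_common_format(raw_b, "station_b")
```

<p>Once all sources share the same column names and types, a quality control step can run the same tests on every row regardless of where it came from.</p>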