<p>In distributed, heterogeneous environmental data ecosystems, the number of data sources and the volume and variety of derivatives, purposes, formats, and replicas keep growing. In theory, this can enrich the information system as a whole: combining and fusing several data sources and data types can reveal new data value and surface relevant information hidden behind the variety of expressions, formats, replicas, and unknown reliability. In practice, data alignment proves complex and, given capacity and business constraints, is not always justified. One of the most challenging but also most rewarding approaches is semantic alignment, which promises to fill the information gap in data discovery and joins. Formalising it requires an aligned, linked, and machine-readable data model that allows relations between data elements and the generated information to be specified. The Iliad Digital Twins of the Ocean are a case of this kind: in-situ data and citizen science observations are mixed with multidimensional environmental data to enable data science and the implementation of what-if models, and to be integrated into even broader ecosystems such as the European Digital Twin Ocean (EDITO) and the European Data Spaces. An Ocean Information Model (OIM) that enables traversals and profiles is the semantic backbone of the ecosystem. Defined as a multi-level ontology, it will explain data using well-known generic (Darwin Core, WoT), spatio-temporal (SOSA/SSN, OGC Geo, W3C Time, QUDT, W3C RDF Data Cube, WoT) and domain (WoRMS, AGROVOC) ontologies. Machine readability and unambiguity allow for both automated validation and some translations.<br>On the other hand, using it efficiently requires yet another skill in data management and development, besides GIS, ICT and domain expertise. 
In addition, because the semantics used in data and metadata have not yet stabilised at the implementation level, there is additional flexibility in how data can be expressed. Following the GEO data sharing and data management principles along with FAIR, CARE and TRUST, the environmental data is prepared for harmonisation. Furthermore, to ease entry and to harmonise conventions, the authors introduce a multi-touchpoint data value chain API suite with an aligned approach to semantically enrich, entail and validate data sets: from observation streams in JSON or JSON-LD based on OIM, through storage and scientific data in NetCDF, to exposing this semantically aligned data via the newly endorsed and already successful OGC Environmental Data Retrieval API. The practical approach is supported by a ready-to-use toolbox of portable components for building and validating multi-source geospatial data integrations while keeping track of the information added during mash-ups, predictions, and what-if implementations.</p>
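<p>The semantic enrichment step described above can be illustrated with a minimal sketch. It is not the authors' API suite and does not use OIM itself: it assumes a plain JSON observation with the hypothetical keys <code>sensor</code>, <code>property</code>, <code>value</code> and <code>time</code>, and maps them to terms from the W3C SOSA vocabulary by attaching a JSON-LD <code>@context</code>. The function name <code>enrich</code>, the key names and the <code>example.org</code> IRIs are assumptions made for illustration only.</p>

```python
import json

# Namespaces of the W3C SOSA vocabulary and XML Schema datatypes.
SOSA = "http://www.w3.org/ns/sosa/"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical mapping from plain JSON keys to SOSA terms.
# A real deployment would take this context from its own model (e.g. OIM).
CONTEXT = {
    "sosa": SOSA,
    "xsd": XSD,
    "sensor": {"@id": "sosa:madeBySensor", "@type": "@id"},
    "property": {"@id": "sosa:observedProperty", "@type": "@id"},
    "value": {"@id": "sosa:hasSimpleResult"},
    "time": {"@id": "sosa:resultTime", "@type": "xsd:dateTime"},
}

# Keys this sketch expects in every incoming observation (an assumption).
REQUIRED_KEYS = ("sensor", "property", "value", "time")


def enrich(observation: dict) -> dict:
    """Return a JSON-LD document wrapping a plain JSON observation."""
    missing = [key for key in REQUIRED_KEYS if key not in observation]
    if missing:
        raise ValueError(f"observation is missing keys: {missing}")
    # The original payload is kept unchanged; only @context and @type are added.
    return {"@context": CONTEXT, "@type": "sosa:Observation", **observation}


# Illustrative observation; the IRIs are placeholders, not real identifiers.
raw = {
    "sensor": "https://example.org/sensors/buoy-1",
    "property": "https://example.org/properties/sea-water-temperature",
    "value": 11.4,
    "time": "2024-01-15T10:00:00Z",
}

enriched = enrich(raw)
print(json.dumps(enriched, indent=2))
```

<p>Because the original keys are left untouched, consumers that only understand plain JSON can keep reading the payload, while JSON-LD-aware tools can expand the same document into linked data. The key check here is only a structural sanity test; it does not replace validation against an ontology or shapes.</p>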