The French Critical Zone research infrastructure, OZCAR-RI, gathers 20 observatories that sample various compartments of the critical zone, each of which has developed its own data management and distribution system. A common information system (Theia/OZCAR IS) was built to make their in situ observations FAIR (findable, accessible, interoperable, reusable). The IS architecture was designed after consulting the users, data producers and IT teams involved in data management. A common data model based on several metadata standards was defined to create information flows between the observatories' ISs and the Theia/OZCAR IS. Controlled vocabularies were defined to support a data discovery web portal offering a faceted search on various criteria, including variable names and categories, which were harmonized in a thesaurus published on the web. This paper describes the IS architecture, the pivot data model and the open-source solutions used to implement data discovery, as well as the future steps needed to implement data downloading and interoperability services.
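As a rough illustration of the faceted search mentioned above, the following minimal sketch filters catalogue records on harmonized criteria and counts the records available under each facet value. The `Dataset` fields, the function names and the sample records are assumptions made for this example; they are not taken from the Theia/OZCAR pivot data model or portal.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """One catalogue record; all field names are illustrative assumptions."""
    title: str
    observatory: str        # producer of the data
    variable_category: str  # harmonized category from the shared thesaurus
    variable_name: str      # harmonized variable name


def facet_counts(records, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(getattr(r, facet) for r in records)


def faceted_search(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(getattr(r, facet) == value for facet, value in selected.items())
    ]


# Made-up records, for illustration only.
CATALOGUE = [
    Dataset("Plot A soil profile", "Observatory A", "Soil", "Soil moisture"),
    Dataset("Plot B soil profile", "Observatory B", "Soil", "Soil moisture"),
    Dataset("Outlet gauge", "Observatory A", "Surface water", "River discharge"),
]

if __name__ == "__main__":
    print(facet_counts(CATALOGUE, "variable_category"))
    for hit in faceted_search(
        CATALOGUE, observatory="Observatory A", variable_category="Soil"
    ):
        print(hit.title)
```

Facets only combine cleanly like this when every producer's records use the same controlled values, which is the role the harmonized thesaurus plays in the architecture described above.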
<p>Understanding, modeling and predicting the future of the Earth System in response to global change is a challenge for the Earth System science community, but also a necessity for addressing pressing societal needs related to the UN Sustainable Development Goals and to risk monitoring and prediction. These &#8220;wicked&#8221; environmental problems require integrated modeling tools. Such tools will only provide reliable answers if they integrate all existing multi-disciplinary data sources. Open science and the FAIR (Findable, Accessible, Interoperable, Reusable) principles provide the framework for such data sharing. However, putting them into practice runs up against a highly fragmented landscape, in which different communities have developed their own data management systems, standards and tools.</p> <p>When we started work on the Theia/OZCAR Information System (IS), which aims to facilitate the discovery of, and make FAIR, in situ data on continental surfaces collected by French research organizations and their foreign partners, we performed a &#8220;Tour de France&#8221; to understand what critical zone science users need when searching for data. The common criterion that emerged was the variable name. We believe that this need is general to all disciplines involved in Earth System sciences, and that it is all the more important when data are searched by scientists from other disciplines who are not familiar with a community's vocabulary. The aim of this abstract is to share our experience in building tools to harmonize and share variable names following the FAIR principles.</p> <p>In the Theia/OZCAR critical zone research community, the long-term observatories that produce the data have heterogeneous data description practices and variable names. The same variable may go by different names (e.g. "soil moisture", "soil water content", "humidit&#233; des sols").
Moreover, similarities between these variable names cannot be inferred automatically or semi-automatically. In order to identify these similarities and to implement data discovery functionalities on these dimensions in the IS, we built the Theia/OZCAR variable thesaurus. To make the thesaurus technically interoperable, it is published on the web using the SKOS vocabulary description standard. Other thesauri used in environmental sciences in Europe and worldwide have been identified, and defining associative relationships with these vocabularies ensures the semantic interoperability of the Theia/OZCAR thesaurus. However, the variable names used as search dimensions often remain general (e.g. "soil moisture") and are not specific enough for the end user to interpret exactly what has been measured (e.g. "soil moisture at 10 cm depth measured by TDR probe"). Therefore, to improve data reuse and interoperability, the thesaurus now follows a recommendation of the Research Data Alliance and implements the I-ADOPT framework to describe the variables more precisely. Each variable is decomposed into atomic concepts, each with a specified definition, and described through its relationships with them. The use of these atomic concepts enhances interoperability with other catalogues or services and contributes to the reuse of the data by communities other than those that collected them.</p>
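<p>The two ideas above can be sketched in a few lines of Python: a SKOS-style concept that gathers the labels used by different producers under one preferred label, and an I-ADOPT-style variable assembled from atomic concepts. This is only an illustration under stated assumptions: the URI is a placeholder, the class and function names are invented for the example, and the decomposition of the soil moisture variable is one plausible reading, not the entry actually published in the Theia/OZCAR thesaurus. The measurement method is left out of the sketch.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """SKOS-style concept: one preferred label plus alternative labels."""
    uri: str
    pref_label: str
    alt_labels: list = field(default_factory=list)
    exact_match: list = field(default_factory=list)  # URIs in other thesauri


@dataclass
class Variable:
    """I-ADOPT-style description of a variable built from atomic concepts."""
    label: str
    has_property: str
    has_object_of_interest: str
    has_matrix: str = ""
    has_constraints: list = field(default_factory=list)


def build_index(concepts):
    """Map every label (lower-cased) to the concept that carries it."""
    index = {}
    for concept in concepts:
        for label in [concept.pref_label, *concept.alt_labels]:
            index[label.strip().lower()] = concept
    return index


def harmonize(producer_label, index):
    """Return the preferred label for a producer's variable name, or None."""
    concept = index.get(producer_label.strip().lower())
    return concept.pref_label if concept else None


# Placeholder URI: the real thesaurus identifiers are not given in the text.
SOIL_MOISTURE = Concept(
    uri="https://example.org/thesaurus/soil-moisture",
    pref_label="Soil moisture",
    alt_labels=["soil water content", "humidité des sols"],
)
INDEX = build_index([SOIL_MOISTURE])

# Assumed decomposition, for illustration only.
SOIL_MOISTURE_10CM = Variable(
    label="Soil moisture at 10 cm depth",
    has_property="volumetric content",
    has_object_of_interest="water",
    has_matrix="soil",
    has_constraints=["depth of 10 cm"],
)

if __name__ == "__main__":
    for name in ("Soil Water Content", "humidité des sols", "air temperature"):
        print(name, "->", harmonize(name, INDEX))
    print(SOIL_MOISTURE_10CM)
```

<p>The label index is curated by hand rather than computed, which matches the observation above that similarities between producers' variable names cannot be inferred automatically.</p>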