<p>Sponsors and funding agencies are asking many geophysical data centers to report in greater detail than before on which data and services are used, by whom, and for what purpose. In the past, bulk information about the number of users or accesses and the volume of downloads was deemed sufficient in most cases. Up to now, data centers have generally offered anonymous access to large parts of their holdings, with differing approaches to basic monitoring and access logging, e.g. by IP address, which serves as a rough proxy from which the geographical distribution of users can be inferred in some detail.</p><p>Already today, access to embargoed or otherwise restricted data, or to advanced functions such as personal work spaces and computational resources, is usually protected by user authentication and authorisation. Standardization of identity management protocols is required to further support the federation of data centers and their services, also in light of future integration with cloud services or other integrated services. In seismology, for example, federated data retrieval systems follow a specific credential process based on standards for data exchange and web services established and maintained by the International Federation of Digital Seismograph Networks (FDSN).</p><p>Meeting these new information requirements from funding agencies would, however, require implementing identity management systems and some form of user identification and authentication for many or all data center services and resources.
This is raising concerns within the data centers on several fronts: evidence from other domains demonstrates that requiring authentication reduces the use of data center services; enforcing authentication is often perceived as out of line with best practices for open science; implementing identity management for usage profiling may significantly increase effort at the data centers, especially for compliance with data protection legislation such as GDPR, and may significantly impede automated (scripted) machine-to-machine access; and the level of detail that should be reported back to funding agencies is unclear, with doubts as to whether detailed user profiling is a reasonable &#8216;performance indicator&#8217;. Indeed, knowledge about users needs to be gathered through technical implementations that take into account the impact on user experience, the impact on decades of research tool development, and the resources needed to implement and operate such systems, whether they are embedded in the operational services or take other forms such as surveys and outreach to user groups.</p><p>Representatives of major geophysical data centers have now started discussions so that interim plans can be shared, ideas and experiences exchanged, and standard approaches developed and recommended for consideration by the community. In these discussions we consider both scenarios in which identity management is useful and relevant and scenarios in which we may consolidate our views and arguments with respect to the general user data reporting requests.</p>
<p>The data center of the National Science Foundation&#8217;s Seismological Facility for the Advancement of Geoscience (SAGE), operated by IRIS Data Services, has evolved over the past 30 years to address the data accessibility needs of the scientific research community. In recent years a broad call for adherence to FAIR data principles has prompted repositories to step up their efforts to support them. Because these principles are well aligned with the needs of data users, many of them are already supported and actively promoted by IRIS. Standardized metadata and data identifiers support findability. Open and standardized web services enable a high degree of accessibility. Interoperability is ensured by offering data in rich, domain-specific formats as well as simple, text-based formats. The use of open, rich (domain-specific) format standards enables a high degree of reuse. Further advances towards these principles include the introduction and dissemination of DOIs for data and the introduction of Linked Data support, via JSON-LD, which allows scientific data brokers, catalogers, and generic search systems to discover data. Naturally, some challenges remain, such as the granularity and mechanisms needed for persistent IDs for data; the reality that metadata is updated with corrections (which has implications for FAIR data principles); and the complexity of data licensing in a repository with data contributed by individual PIs, national observatories, and international collaborations. In summary, IRIS Data Services is well along the path of adherence to FAIR data principles, with more work to do. We will present the current status of these efforts and describe the key challenges that remain.</p>
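<p>To illustrate what Linked Data support via JSON-LD can look like in practice, the following is a minimal sketch that builds a schema.org <code>Dataset</code> description and wraps it in the script tag that generic search systems typically read from a landing page. It is not IRIS's actual implementation: the function names, the DOI (which uses the 10.5555 test prefix), the landing page URL, and the license are all hypothetical placeholders chosen for illustration.</p>

```python
import json


def build_dataset_jsonld(name, description, doi, landing_page, license_url):
    """Return a schema.org Dataset description as a JSON-LD-ready dict.

    All argument values are supplied by the caller; nothing here reflects
    a real repository's metadata model.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Express the persistent identifier as a resolvable DOI URL.
        "identifier": f"https://doi.org/{doi}",
        "url": landing_page,
        "license": license_url,
    }


def to_script_tag(record):
    """Serialize the record for embedding in an HTML landing page."""
    payload = json.dumps(record, indent=2, sort_keys=True)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical example values (assumptions, not real holdings).
example = build_dataset_jsonld(
    name="Example Seismic Network",
    description="Continuous waveform data from a hypothetical network.",
    doi="10.5555/example-network",
    landing_page="https://data.example.org/networks/XX",
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

if __name__ == "__main__":
    print(to_script_tag(example))
```

<p>Keeping the identifier as a DOI URL ties the discovery metadata to the same persistent ID that is disseminated for citation. How fine-grained that ID should be remains one of the open challenges noted above.</p>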