The data management landscape associated with the Global Ocean Observing System is distributed, complex, and only loosely coordinated. Yet interoperability across this distributed landscape is essential to enable data to be reused, preserved, and integrated and to minimize costs in the process. A building block for a distributed system in which component systems can exchange and understand information is standardization of data formats, distribution protocols, and metadata. By reviewing several data management use cases we attempt to characterize the current state of ocean data interoperability and make suggestions for continued evolution of the interoperability standards underpinning the data system. We reaffirm the technical data standard recommendations from previous OceanObs conferences and suggest incremental improvements to them that can help the GOOS data system address the significant challenges that remain in order to develop a truly multidisciplinary data system.
This article describes the publication, as Linked Open Data, of occurrences of Southern Elephant Seals Mirounga leonina (Linnaeus, 1758) in two environments (marine and coastal). The data consists of hydrographic measurements from instrumented animals and observation data collected during censuses between 1990 and 2017. The data scheme is based on the previously developed ontology BiGe-Onto and the new version of the Semantic Sensor Network ontology (SSN). We introduce the network of ontologies used to organize the data and the transformation process used to publish the dataset. In the use case, we develop an application to access and analyze the dataset. The linked open dataset and the related visualization tool turn the data into a resource that can be located by the international community, thereby increasing the commitment to its sustainability. The data, coming from Península Valdés (a UNESCO World Heritage site), is available for interdisciplinary studies of management and conservation of marine and coastal protected areas, which demand reliable and updated data.
We describe the creation and quality assurance of a dataset containing nearly all available precinct-level election results from the 2016, 2018, and 2020 American elections. Precincts are the smallest level of election administration, and election results at this granularity are needed to address many important questions. However, election results are individually reported by each state with little standardization or data quality assurance. We have collected, cleaned, and standardized precinct-level election results from every available race above the very local level in almost every state across the last three national election years. Our data include nearly every candidate for president, US Congress, governor, or state legislator, and hundreds of thousands of precinct-level results for judicial races, other statewide races, and even local races and ballot initiatives. In this article we describe the process of finding this information and standardizing it. Then we aggregate the precinct-level results up to geographies that have official totals, and show that our totals never differ from the official nationwide data by more than 0.457%.
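The aggregation check described above can be illustrated with a minimal sketch. It assumes each precinct record carries a state, office, candidate, and vote count, and that the discrepancy is measured as the absolute difference relative to the official total. The field names, the function names, and the sample figures are hypothetical and are not taken from the dataset; the article's exact definition of the difference may differ.

```python
from collections import defaultdict


def aggregate_votes(precinct_rows):
    """Sum precinct-level votes up to (state, office, candidate) totals."""
    totals = defaultdict(int)
    for row in precinct_rows:
        key = (row["state"], row["office"], row["candidate"])
        totals[key] += row["votes"]
    return dict(totals)


def percent_difference(aggregated, official):
    """Absolute difference as a percentage of the official total (assumed metric)."""
    if official == 0:
        raise ValueError("official total must be non-zero")
    return abs(aggregated - official) / official * 100.0


def compare_to_official(precinct_rows, official_totals):
    """Return the percent difference for every key that has an official total."""
    aggregated = aggregate_votes(precinct_rows)
    return {
        key: percent_difference(aggregated.get(key, 0), official)
        for key, official in official_totals.items()
    }


# Hypothetical sample data, for illustration only.
precinct_rows = [
    {"state": "XX", "office": "President", "candidate": "A", "votes": 120},
    {"state": "XX", "office": "President", "candidate": "A", "votes": 80},
    {"state": "XX", "office": "President", "candidate": "B", "votes": 95},
    {"state": "XX", "office": "President", "candidate": "B", "votes": 105},
]
official_totals = {
    ("XX", "President", "A"): 200,
    ("XX", "President", "B"): 201,
}

differences = compare_to_official(precinct_rows, official_totals)
for key, diff in sorted(differences.items()):
    print(key, f"{diff:.3f}%")
```

Keying the totals on the full (state, office, candidate) tuple keeps the comparison at whatever geography the official figures are reported for; a coarser or finer geography would only change the key.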