A digital map of the built environment is useful for a range of economic, emergency response, and urban planning exercises such as helping find places in app driven interfaces, helping emergency managers know what locations might be impacted by a flood or fire, and helping city planners proactively identify vulnerabilities and plan for how a city is growing. Since its inception in 2004, OpenStreetMap (OSM) sets the benchmark for open geospatial data and has become a key player in the public, research, and corporate realms. Following the foundations laid by OSM, several open geospatial products describing the built environment have blossomed including the Microsoft USA building footprint layer and the OpenAddress project. Each of these products use different data collection methods ranging from public contributions to artificial intelligence, and if taken together, could provide a comprehensive description of the built environment. Yet, these projects are still siloed, and their variety makes integration and interoperability a major challenge. Here, we document an approach for merging data from these three major open building datasets and outline a workflow that is scalable to the continental United States (CONUS). We show how the results can be structured as a knowledge graph over which machine learning models are built. These models can help propagate and complete unknown quantities that can then be leveraged in disaster management.
Traditionally, scientists have captured data on the history of Earth processes in the printed literature where the data sets, by the nature of the medium, are incomplete. The answer to this "information bottleneck" is geoinformatics, a revolution that is migrating information to more readily retrievable electronic formats. The CHRONOS System is a community facility that addresses the geoinformatics needs of sedimentary geology and paleobiology, emphasizes global correlation and time-series analysis, and directly supports cutting-edge research on topics that include the evolution and diversity of life, climate change, geochemical cycles, and many other aspects of the Earth system. The CHRONOS System provides simultaneous and seamless integration of hosted and federated databases with analytical and visualization tools. CHRONOS exploits software advances for translating between different data structures and terminologies to achieve interoperability within a distributed data management system.
Persistent identifiers are applied to an ever-increasing variety of research objects, including software, samples, models, people, instruments, grants, and projects, and there is a growing need to apply identifiers at a finer and finer granularity. Unfortunately, the systems developed over two decades ago to manage identifiers and the metadata describing the identified objects no longer scale. Communities working with physical samples have grappled with these three challenges of the increasing volume, variety, and variability of identified objects for many years. To address this dual challenge, the IGSN 2040 project explored how metadata and catalogues for physical samples could be shared at the scale of billions of samples across an ever-growing variety of users and disciplines. In this paper, we focus on how we scale identifiers and their describing metadata to billions of objects and who the actors involved with this system are. Our analysis of these requirements resulted in the definition of a minimum viable product and the design of an architecture that not only addresses the challenges of increasing volume and variety but, more importantly, is easy to implement because it reuses commonly used Web components. Our solution is based on a Web architectural model that utilises Schema.org, JSON-LD, and sitemaps. Applying these commonly used architectural patterns on the internet allows us to not only handle increasing variety but also enable better compliance with the FAIR Guiding Principles.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.