Several scholarly knowledge graphs have been proposed to model and analyze the academic landscape. However, although the number of data sets has grown remarkably in recent years, these knowledge graphs focus primarily on associated entities such as publications rather than on the data sets themselves. Moreover, publicly available data set knowledge graphs do not systematically link to the publications in which the data sets are mentioned. In this paper, we present an approach for constructing an RDF knowledge graph that addresses both gaps. Our data set knowledge graph, DSKG, is publicly available at http://dskg.org and contains metadata on data sets from all scientific disciplines. To ensure high data quality, we first identify suitable raw data set collections from which to build the DSKG. We then link each data set to the publications, as modeled in the Microsoft Academic Knowledge Graph, that mention it. Because the author names of data sets can be ambiguous, we develop and evaluate a method for author name disambiguation and enrich the knowledge graph with links to ORCID. Overall, our knowledge graph contains more than 2,000 data sets with associated properties, as well as 814,000 links to 635,000 scientific publications. It supports a variety of scenarios, including advanced data set search systems and new ways of measuring and rewarding the provision of data sets.
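To make the idea of linking a data set to the publications that mention it more concrete, the following is a minimal sketch in plain Python that emits such links as N-Triples lines. It is an illustration under stated assumptions, not the DSKG schema: the choice of `dcat:Dataset`, `dcterms:title`, and `dcterms:isReferencedBy` as vocabulary terms, the `example.org` identifiers, and the helper names `iri`, `literal`, and `dataset_triples` are all hypothetical.

```python
# Sketch only: the vocabulary terms below are real W3C/DCMI terms, but their
# use here is an assumption and may differ from what the DSKG actually uses.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_IS_REFERENCED_BY = "http://purl.org/dc/terms/isReferencedBy"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Format a plain string literal, escaping backslashes, quotes, and line breaks."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def dataset_triples(dataset_iri: str, title: str, publication_iris: list[str]) -> list[str]:
    """Return N-Triples lines describing one data set and its mentioning publications."""
    subject = iri(dataset_iri)
    lines = [
        f"{subject} {iri(RDF_TYPE)} {iri(DCAT_DATASET)} .",
        f"{subject} {iri(DCT_TITLE)} {literal(title)} .",
    ]
    # One link per publication that mentions the data set.
    for publication_iri in publication_iris:
        lines.append(f"{subject} {iri(DCT_IS_REFERENCED_BY)} {iri(publication_iri)} .")
    return lines


if __name__ == "__main__":
    # Hypothetical identifiers, used for illustration only.
    for line in dataset_triples(
        "http://example.org/dataset/1",
        "Example data set",
        ["http://example.org/paper/7", "http://example.org/paper/9"],
    ):
        print(line)
```

Keeping each mention as its own triple means the number of links per data set can be counted directly, which is the kind of measurement the abstract has in mind.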