This article focuses on PubMed’s Best Match sorting algorithm, presenting a simplified explanation of how it operates and highlighting how artificial intelligence affects search results in ways that users do not see. We further discuss user search behaviors and the ethical implications of algorithms, specifically for health care practitioners. PubMed recently began using artificial intelligence to improve the sorting of search results through a Best Match option. In 2020, PubMed deployed this algorithm as the default search method, necessitating serious discussion around the ethics of this and similar algorithms, as users do not always know when an algorithm uses artificial intelligence, what artificial intelligence is, or how it may impact their everyday tasks. These implications resonate strongly in health care, in which the speed and relevancy of search results are crucial but do not negate the importance of avoiding bias in how those search results are selected or presented to the user. Because health care providers rarely venture past the first few results when searching to inform a clinical decision, will Best Match help them find the answers they need more quickly? Or will the algorithm bias their results, leading to the potential suppression of more recent or relevant results?
The Federated Research Data Repository (FRDR), developed through a partnership between the Canadian Association of Research Libraries’ Portage initiative and the Compute Canada Federation, improves research data discovery in Canada by providing a single search portal for research data stored across Canadian governmental, institutional, and discipline-specific data repositories. While this national discovery layer helps to de-silo Canadian research data, challenges in data discovery remain due to a lack of standardized metadata practices across repositories. In recognition of this challenge, a Portage task group, drawn from a national network of experts, has engaged in a project to map subject keywords to the Online Computer Library Center’s (OCLC) Faceted Application of Subject Terminology (FAST) using the open source OpenRefine software. This paper will describe the task group’s project, discuss the various approaches undertaken by the group, and explore how this work improves data discovery and may be adopted by other repositories and metadata aggregators to support metadata standardization.
The authors would like to thank Eka Grguric and Jessica Lange for their feedback on the survey tool, and thank the anonymous reviewers and editors at The Journal of Library Metadata for their thoughtful comments and suggestions on earlier drafts of this article.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.