This article discusses structural, systems, and other types of bias that arise in matching new records to large databases. The focus is on databases for bibliographic utilities, but related database concerns are also discussed. The article presents the problems of satisfying a “match” with sufficient flexibility and rigor in an environment of imperfect data, and discusses sources of unintentional variance.
Purpose - Describing musical pieces, whether sound recordings, scores, librettos, or videos, has always involved cataloger interpretation and judgment. Records created for exactly the same item vary considerably, and there is never "proof" that two records that seem to describe the same item actually do. This paper aims to address this issue. Design/methodology/approach - This paper describes some of the challenges encountered in developing software for matching music records, and some approaches to making the software reliable. Findings - The paper finds that matching can be used successfully to create GLIMIR clusters in the WorldCat database. Work is needed in several areas to complete the implementation, but intermediate results are promising.