The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments.
The first part of this two-part series argues that the assumption of topic matching between user needs and texts topically relevant to those needs is often erroneous. This second part reports an empirical investigation of the question, "What relationship types actually account for topical relevance?" In order to avoid the bias of topic-matching search strategies, user needs are back-generated from a randomly selected subset of the subject headings employed in a user-oriented topical concordance. The corresponding relevant texts are those indicated in the concordance under the subject heading. The study compares the topics of the user needs with the topics of the relevant texts to determine the relationships between them. This examination reveals that topical relevance relationships include a large variety of relationships, only some of which are matching relationships. Others are examples of paradigmatic relationships or syntagmatic relationships. Indeed, there appear to be no constraints on the kinds of relationships that can function as topical relevance relationships. They are distinguishable from other types of relationships only on functional grounds.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.