The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments

Gonçalves, Rafael S.; O’Connor, Martin J.; Martínez-Romero, Marcos; Egyedi, Attila L.; Willrett, Debra; Graybeal, John; Musen, Mark A.

doi:10.1007/978-3-319-68204-4_10

Cited by 32 publications

(28 citation statements)

References 17 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…We integrated our value-recommendation approach into a metadata collection and management platform called the CEDAR Workbench (27), which was developed by the Center for Expanded Data Annotation and Retrieval (CEDAR) (5). The CEDAR Workbench is a Web-based system comprising a set of highly-interactive tools to help create, manage, and submit biomedical metadata for use in online data repositories.…”

Section: Methodsmentioning

confidence: 99%

Using association rule mining and ontologies to generate metadata recommendations from multiple biomedical databases

et al. 2019

Self Cite

View full text Add to dashboard Cite

Metadata—the machine-readable descriptions of the data—are increasingly seen as crucial for describing the vast array of biomedical datasets that are currently being deposited in public repositories. While most public repositories have firm requirements that metadata must accompany submitted datasets, the quality of those metadata is generally very poor. A key problem is that the typical metadata acquisition process is onerous and time consuming, with little interactive guidance or assistance provided to users. Secondary problems include the lack of validation and sparse use of standardized terms or ontologies when authoring metadata. There is a pressing need for improvements to the metadata acquisition process that will help users to enter metadata quickly and accurately. In this paper, we outline a recommendation system for metadata that aims to address this challenge. Our approach uses association rule mining to uncover hidden associations among metadata values and to represent them in the form of association rules. These rules are then used to present users with real-time recommendations when authoring metadata. The novelties of our method are that it is able to combine analyses of metadata from multiple repositories when generating recommendations and can enhance those recommendations by aligning them with ontology terms. We implemented our approach as a service integrated into the CEDAR Workbench metadata authoring platform, and evaluated it using metadata from two public biomedical repositories: US-based National Center for Biotechnology Information BioSample and European Bioinformatics Institute BioSamples. The results show that our approach is able to use analyses of previously entered metadata coupled with ontology-based mappings to present users with accurate recommendations when authoring metadata.

show abstract

Section: Methodsmentioning

confidence: 99%

Using association rule mining and ontologies to generate metadata recommendations from multiple biomedical databases

et al. 2019

Self Cite

View full text Add to dashboard Cite

show abstract

“…Metadata captured in these wizards is then automatically transformed into the format specified by the repository, removing this task from the user. The design is distinct from that of Dendro 15 and CEDAR 16 , two other platforms with related capabilities. Dendro provides linear web forms.…”

Section: Researchers Need Help To Biocuratementioning

confidence: 99%

COPO: a metadata platform for brokering FAIR data in the life sciences

Etuk

Shaw

González-Beltrán

et al. 2019

Preprint

View full text Add to dashboard Cite

Scientific innovation is increasingly reliant on data and computational resources. Much of today's life science research involves generating, processing, and reusing heterogeneous datasets that are growing exponentially in size. Demand for technical experts (data scientists and bioinformaticians) to process these data is at an all-time high, but these are not typically trained in good data management practices. That said, we have come a long way in the last decade, with funders, publishers, and researchers themselves making the case for open, interoperable data as a key component of an open science philosophy. In response, recognition of the FAIR Principles (that data should be Findable, Accessible, Interoperable and Reusable) has become commonplace. However, both technical and cultural challenges for the implementation of these principles still exist when storing, managing, analysing and disseminating both legacy and new data.COPO is a computational system that attempts to address some of these challenges by enabling scientists to describe their research objects (raw or processed data, publications, samples, images, etc.) using community-sanctioned metadata sets and vocabularies, and then use public or institutional repositories to share it with the wider scientific community. COPO encourages data generators to adhere to appropriate metadata standards when publishing research objects, using semantic terms to add meaning to them and specify relationships between them. This allows data consumers, be they people or machines, to find, aggregate, and analyse data which would otherwise be private or invisible. Building upon existing standards to push the state of the art in scientific data dissemination whilst minimising the burden of data publication and sharing. Availability:COPO is entirely open source and freely available on GitHub at https://github.com/collaborative-open-plant-omics . A public instance of the platform for use by the community, as well as more information, can be found at copo-project.org .

show abstract

“…At the same time, a systematic and well-documented extraction process is essential to keep the curated metadata updated over time and portable between different projects [32]. Perhaps the sole example of an open-source, web-based framework for the acquisition, storage, search, and reuse of scientific metadata is the CEDAR workbench [17]. On the one hand, the entirety of neuroscience is too broad and diverse to fully benefit from an all-encompassing metadata annotation tool.…”

Section: Introductionmentioning

confidence: 99%

An open-source framework for neuroscience metadata management applied to digital reconstructions of neuronal morphology

2020

View full text Add to dashboard Cite

Research advancements in neuroscience entail the production of a substantial amount of data requiring interpretation, analysis, and integration. The complexity and diversity of neuroscience data necessitate the development of specialized databases and associated standards and protocols. NeuroMorpho.Org is an online repository of over one hundred thousand digitally reconstructed neurons and glia shared by hundreds of laboratories worldwide. Every entry of this public resource is associated with essential metadata describing animal species, anatomical region, cell type, experimental condition, and additional information relevant to contextualize the morphological content. Until recently, the lack of a user-friendly, structured metadata annotation system relying on standardized terminologies constituted a major hindrance in this effort, limiting the data release pace. Over the past 2 years, we have transitioned the original spreadsheet-based metadata annotation system of NeuroMorpho.Org to a custom-developed, robust, web-based framework for extracting, structuring, and managing neuroscience information. Here we release the metadata portal publicly and explain its functionality to enable usage by data contributors. This framework facilitates metadata annotation, improves terminology management, and accelerates data sharing. Moreover, its open-source development provides the opportunity of adapting and extending the code base to other related research projects with similar requirements. This metadata portal is a beneficial web companion to NeuroMorpho.Org which saves time, reduces errors, and aims to minimize the barrier for direct knowledge sharing by domain experts. The underlying framework can be progressively augmented with the integration of increasingly autonomous machine intelligence components.

show abstract

The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata that Describe Scientific Experiments

Cited by 32 publications

References 17 publications

Using association rule mining and ontologies to generate metadata recommendations from multiple biomedical databases

Using association rule mining and ontologies to generate metadata recommendations from multiple biomedical databases

COPO: a metadata platform for brokering FAIR data in the life sciences

An open-source framework for neuroscience metadata management applied to digital reconstructions of neuronal morphology

Contact Info

Product

Resources

About