We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.
BioStudies (www.ebi.ac.uk/biostudies) is a new public database that organizes data from biological studies. Typically, but not exclusively, a study is associated with a publication. BioStudies offers a simple way to describe the study structure, and provides flexible data deposition tools and data access interfaces. The actual data can be stored either in BioStudies or remotely, or both. BioStudies imports supplementary data from Europe PMC, and is a resource for authors and publishers for packaging data during the manuscript preparation process. It also can support data management needs of collaborative projects. The growth in multiomics experiments and other multi-faceted approaches to life sciences research mean that studies result in a diversity of data outputs in multiple locations. BioStudies presents a solution to ensuring that all these data and the associated publication(s) can be found coherently in the longer term.
In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.
Bioimaging data have significant potential for reuse, but unlocking this potential requires systematic archiving of data and metadata in public databases. We propose draft metadata guidelines to begin addressing the needs of diverse communities within light and electron microscopy. We hope this publication and the proposed Recommended Metadata for Biological Images (REMBI) will stimulate discussions about their implementation and future extension.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.