DEBBIE: The Open Access Database of Experimental Scaffolds and Biomaterials Built Using an Automated Text Mining Pipeline

Corvi, Javier O.; McKitrick, Austin; Fernández, José M.; Fuenteslópez, Carla V.; Gelpí, Josep L.; Ginebra, Maria‐Pau; Capella‐Gutierrez, Salvador; Hakimi, Osnat

doi:10.1002/adhm.202300150

Adv Healthcare Materials

2023

Abstract: Biomaterials research output has experienced an exponential increase over the last three decades. The majority of research is published in the form of scientific articles and is therefore available as unstructured text, making it a challenging input for computational processing. Computational tools are becoming essential to overcome this information overload. Among them, text mining systems present an attractive option for the automated extraction of information from text documents into structured datasets. Th…

Cited by 2 publications

References 34 publications

Biomaterials text mining: A hands-on comparative study of methods on polydioxanone biocompatibility

Fuenteslópez, McKitrick, Corvi et al. 2023

New Biotechnology

Overview of DrugProt task at BioCreative VII: data and methods for large-scale text mining and knowledge graph generation of heterogenous chemical–protein relations

Miranda-Escalada, Mehryary, Luoma et al. 2023

Database

It is getting increasingly challenging to efficiently exploit drug-related information described in the growing amount of scientific literature. Indeed, for drug–gene/protein interactions, the challenge is even bigger, considering the scattered information sources and types of interactions. However, their systematic, large-scale exploitation is key for developing tools, impacting knowledge fields as diverse as drug design or metabolic pathway research. Previous efforts in the extraction of drug–gene/protein interactions from the literature did not address these scalability and granularity issues. To tackle them, we have organized the DrugProt track at BioCreative VII. In the context of the track, we have released the DrugProt Gold Standard corpus, a collection of 5000 PubMed abstracts, manually annotated with granular drug–gene/protein interactions. We have proposed a novel large-scale track to evaluate the capacity of natural language processing systems to scale to the range of millions of documents, and generate with their predictions a silver standard knowledge graph of 53 993 602 nodes and 19 367 406 edges. Its use exceeds the shared task and points toward pharmacological and biological applications such as drug discovery or continuous database curation. Finally, we have created a persistent evaluation scenario on CodaLab to continuously evaluate new relation extraction systems that may arise. Thirty teams from four continents, which involved 110 people, sent 107 submission runs for the Main DrugProt track, and nine teams submitted 21 runs for the Large Scale DrugProt track. Most participants implemented deep learning approaches based on pretrained transformer-like language models (LMs) such as BERT or BioBERT, reaching precision and recall values as high as 0.9167 and 0.9542 for some relation types. 
Finally, some initial explorations of the applicability of the knowledge graph have shown its potential to explore the chemical–protein relations described in the literature, or chemical compound–enzyme interactions. Database URL: https://doi.org/10.5281/zenodo.4955410
