From language models to large-scale food and biomedical knowledge graphs

Cenikj, Gjorgjina; Strojnik, Lidija; Angelski, R; Ogrinc, Nives; Seljak, Barbara Koroušić; Eftimov, Tome

doi:10.1038/s41598-023-34981-4

Cited by 3 publications

(1 citation statement)

References 52 publications


“…In the first approach, LLMs can be used to construct, enrich and refine KGs from text, leveraging LLMs’ ability to extract and recognize structure (Fig. 1a), e.g., as has been applied in the construction of dietary KGs 25 and KGs for precision medicine 26. This is an important application, and it illustrates how modern KGs are generated efficiently through automated machine learning approaches, and not the output of laborious and non-scalable manual approaches.…”

Section: Smoothing Out the Limitations of LLMs (mentioning)

confidence: 99%

Augmented non-hallucinating large language models as medical information curators

Gilbert, Kather, Hogan. 2024. npj Digit. Med.


Reliably processing and interlinking medical information has been recognized as a critical foundation to the digital transformation of medical workflows, and despite the development of medical ontologies, the optimization of these has been a major bottleneck to digital medicine. The advent of large language models has brought great excitement, and maybe a solution to medicine’s ‘communication problem’ is in sight, but how can the known weaknesses of these models, such as hallucination and non-determinism, be tempered? Retrieval Augmented Generation, particularly through knowledge graphs, is an automated approach that can deliver structured reasoning and a model of truth alongside LLMs, relevant to information structuring and therefore also to decision support.


FoodAtlas: Automated knowledge extraction of food and chemicals from literature

Youn, Li, Simmons et al. 2024. Computers in Biology and Medicine


FoodAtlas: Automated Knowledge Extraction of Food and Chemicals from Literature

Youn, Li, Simmons et al. 2024. Preprint


Automated generation of knowledge graphs that accurately capture published information can help with knowledge organization and access, which have the potential to accelerate discovery and innovation. Here, we present an integrated pipeline to construct a large-scale knowledge graph using large language models in an active learning setting. We apply our pipeline to the association of raw food, ingredients, and chemicals, a domain that lacks such knowledge resources. By using an iterative active learning approach of 4,120 manually curated premise-hypothesis pairs as training data for ten consecutive cycles, the entailment model extracted 230,848 food-chemical composition relationships from 155,260 scientific papers, with 106,082 (46.0%) of them never having been reported in any published database. To augment the knowledge incorporated in the knowledge graph, we further incorporated information from 5 external databases and ontology sources. We then applied a link prediction model to identify putative food-chemical relationships that were not part of the constructed knowledge graph. Validation of the 443 hypotheses generated by the link prediction model resulted in 355 new food-chemical relationships, while results show that the model score correlates well (R² = 0.70) with the probability of a novel finding. This work demonstrates how automated learning from literature at scale can accelerate discovery and support practical applications through reproducible, evidence-based capture of latent interactions of diverse entities, such as food and chemicals.
