Context-Enriched Learning Models for Aligning Biomedical Vocabularies at Scale in the UMLS Metathesaurus

Nguyen, Vinh; Yip, Hong Yung; Bajaj, Goonmeet; Wijesiriwardene, Thilini; Javangula, Vishesh; Parthasarathy, Srinivasan; Sheth, Amit P.; Bodenreider, Olivier

doi:10.1145/3485447.3511946

Cited by 2 publications

(8 citation statements)

References 33 publications


“…This section summarizes the key concepts in the UMLS Metathesaurus to be used in the UVA problem. Additional details and examples can be found in [32,33] and the UMLS manual.…”

Section: Knowledge Representation In the UMLS Metathesaurus For The U... (mentioning)

confidence: 99%

“…This section summarizes the main points from the three existing baselines from our prior work including the rule-based approximation, the LexLMs [33], and the ConLMs [32]. Additional details can be found in these original papers.…”

Section: UVA Baselines (mentioning)

confidence: 99%

“…Therefore, our strategy for addressing this problem is finding the best performing approaches in the sampling evaluations as candidates for later full scale evaluations. We have recently achieved preliminary results [2,31,32,33,41] in the sampling evaluation phase.…”

Section: Introduction (mentioning)

confidence: 99%

“…Our recent work [2,31,32,33,41] developed several approaches for addressing the UVA problem. In [33], we described the RBA (rule-based approximation) approach that approximates the current construction process into a set of rules.…”

Section: Introduction (mentioning)

confidence: 99%

“…Although the LexLM largely outperformed the RBA, we noted as a limitation of this work that we only leveraged lexical information and did not include any contextual information. In [32], we addressed this limitation of the LexLM model by incorporating the contextual information into the ConLM model. We also attempted to improve the performance of the LexLM by adding an attention layer into the neural network in [31].…”

Section: Introduction (mentioning)

confidence: 99%


UVA Resources for the Biomedical Vocabulary Alignment at Scale in the UMLS Metathesaurus

Nguyen, Bodenreider

2022

Preprint

Self Cite


The construction and maintenance process of the UMLS (Unified Medical Language System) Metathesaurus is time-consuming, costly, and error-prone as it relies on (1) the lexical and semantic processing for suggesting synonymous terms, and (2) the expertise of UMLS editors for curating the suggestions. For improving the UMLS Metathesaurus construction process, our research group has defined a new task called UVA (UMLS Vocabulary Alignment) and generated a dataset for evaluating the task. Our group has also developed different baselines for this task using logical rules (RBA), and neural networks (LexLM and ConLM). In this paper, we present a set of reusable and reproducible resources including (1) a dataset generator, (2) three datasets generated by using the generator, and (3) three baseline approaches. We describe the UVA dataset generator and its implementation generalized for any given UMLS release. We demonstrate the use of the dataset generator by generating datasets corresponding to three UMLS releases, 2020AA, 2021AA and 2021AB. We provide three UVA baselines using the three existing approaches (LexLM, ConLM, and RBA). The code, the datasets, and the experiments are publicly available, reusable, and reproducible with any UMLS release (a no-cost license agreement is required for downloading the UMLS).

