2020
DOI: 10.26686/wgtn.13151078.v1
Preprint

Ontology-Guided Data Augmentation for Medical Document Classification

Abstract: Extracting meaningful features from unstructured text is one of the most challenging tasks in medical document classification. Domain-specific expressions and synonyms make clinical discharge notes harder to analyse, and the problem is worse for short texts such as abstracts. These challenges can lead to poor classification accuracy. Because medical input data is often scarce in the real world, this work proposes a novel ontology-guided method for data augme…

Citations: cited by 2 publications (2 citation statements)
References: 5 publications
“…These methods replace words of the original training examples by similar words (e.g., synonyms) from a thesaurus (Jungiewicz and Smywinski-Pohl, 2019; Abdollahi et al, 2020) or words with similar embeddings (Wang and Yang, 2015). More recent work uses large language models, pre-trained to predict masked tokens, which suggest replacements of randomly masked words of the original examples (Kobayashi, 2018; Wu et al, 2019).…”
Section: Word Substitution (mentioning)
confidence: 99%
“…Word substitution is a simple and common DA approach in NLP. In thesaurus-based substitution (Jungiewicz and Smywinski-Pohl, 2019; Abdollahi et al, 2020), words are replaced by synonyms or closely related words (e.g., hypernyms). Word embedding substitution (Wang and Yang, 2015) replaces words by others nearby in a pre-trained vector space model (Section 3.4).…”
Section: Related Work (mentioning)
confidence: 99%
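
The thesaurus-based substitution described in the citation statements can be illustrated with a minimal Python sketch. It is not the method of the cited preprint: the `TOY_THESAURUS` dictionary, the function names `substitute_words` and `augment`, and the `prob` and `seed` parameters are all hypothetical, and a real system would presumably draw its replacements from a thesaurus or medical ontology rather than a hand-written dictionary.

```python
import random

# Hypothetical toy thesaurus, for illustration only. A real pipeline would
# look up synonyms or hypernyms in a thesaurus or domain ontology.
TOY_THESAURUS = {
    "pain": ["ache", "discomfort"],
    "severe": ["acute", "intense"],
    "physician": ["doctor", "clinician"],
}


def substitute_words(text, thesaurus, prob=0.5, seed=0):
    """Return a copy of `text` in which each word found in `thesaurus`
    is replaced, with probability `prob`, by one of its listed entries."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    out = []
    for token in text.split():
        candidates = thesaurus.get(token.lower())
        if candidates and rng.random() < prob:
            out.append(rng.choice(candidates))
        else:
            out.append(token)  # words without an entry are kept unchanged
    return " ".join(out)


def augment(text, thesaurus, n_copies=3, prob=0.5):
    """Produce `n_copies` variants of `text`, one per seed."""
    return [substitute_words(text, thesaurus, prob, seed=i) for i in range(n_copies)]


if __name__ == "__main__":
    for variant in augment("Patient reports severe pain", TOY_THESAURUS):
        print(variant)
```

Each variant keeps the original label, so a small training set can be expanded without new annotation. Whitespace tokenisation and case-insensitive lookup are simplifications chosen for brevity.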