2020
DOI: 10.1002/asi.24411

Understanding the stability of medical concept embeddings

Abstract: Frequency is one of the major factors for training quality word embeddings. Several studies have recently discussed the stability of word embeddings in the general domain and suggested factors influencing the stability. In this work, we conduct a detailed analysis of the stability of concept embeddings in the medical domain, particularly in relation to concept frequency. The analysis reveals the surprisingly high stability of low‐frequency concepts: low‐frequency (<100) concepts have the same high stability as high‐f…

Cited by 2 publications (4 citation statements)
References 21 publications
“…Word2Vec [8,11,21,28,32], GloVe [1,22,32], PPMI [1,22,32], SGNS [1,18,22], SVD [22], LSA [1], SGHS [18] Word2Vec, GloVe and fastText…”
Section: Methods Studied (citation type: mentioning)
confidence: 99%
“…• Wikipedia (1.5B) [22] • NYT (58M) and Europarl (61M) [32] • Brown, Project Gutenberg and Reuters (10k each) [8] • US Federal Courts of Appeals (38k), NYT (22k) and Reddit (26k) [1] • NIPS between 2007 and 2012 (2M) [11] • Ohsumed dataset (34M) [21] • Google N-gram corpus: English Fiction (4.8B) and German (0.7B) [18] • BNC and ACL Anthology Reference corpus [28] Wikipedia, News-Crawl (2007), Lyrics and Europarl (50M each)…”
Section: Previous Work Our Work (citation type: mentioning)
confidence: 99%