Proceedings of the 2018 1st International Conference on Internet and E-Business 2018
DOI: 10.1145/3230348.3230357

ICD-9 Tagging of Clinical Notes Using Topical Word Embedding

Cited by 9 publications (11 citation statements)
References 23 publications
“…The majority of studies have applied tokenization, followed by removal of stop words, removal of non-alphabetic characters and lowercase conversion. Apart from that, few studies also used other data processing steps such as regular expression matching [88], building dictionary or vocabulary [5,52], removing non-matching terms [78], removal of de-identified or confidential information [74,36]. Studies including [67,5,61,90,84,50,58,37] truncated the documents to a maximum length of 2,500 or 4,000 tokens in order to reduce the computational cost.…”
Section: Preprocessing (mentioning)
confidence: 99%
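
The preprocessing steps named in the statement above (tokenization, lowercasing, removal of non-alphabetic characters and stop words, then truncation to a maximum token count) can be sketched roughly as follows. This is a minimal illustration, not the pipeline of any cited study: the function name `preprocess`, the tiny `STOP_WORDS` set, and the regex-based tokenizer are all assumptions made for the example; only the 2,500-token limit comes from the quoted text.

```python
import re

# Tiny illustrative stop-word list (an assumption; the quoted survey
# does not say which list the individual studies used).
STOP_WORDS = frozenset(
    {"the", "a", "an", "of", "and", "to", "in", "with", "was", "is", "for", "on"}
)


def preprocess(note: str, max_tokens: int = 2500) -> list[str]:
    """Lowercase, tokenize, drop non-alphabetic text and stop words, truncate."""
    # Lowercase first, then keep only runs of letters. This tokenizes and
    # removes digits and punctuation in a single step.
    tokens = re.findall(r"[a-z]+", note.lower())
    # Drop stop words.
    tokens = [tok for tok in tokens if tok not in STOP_WORDS]
    # Truncate to the maximum length to bound computational cost.
    return tokens[:max_tokens]


if __name__ == "__main__":
    print(preprocess("The patient was admitted with CHF, 2 days of dyspnea."))
```

A real clinical pipeline would likely use a domain-aware tokenizer and a fuller stop-word list; the regex here is only a stand-in.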
“…To summarise, the majority of studies (n=17) applied Word2Vec embedding, followed by TF-IDF feature representation and BoW. Few studies Ayyar and Oliver [4], Lin et al [52], and Mascio et al [56] used GloVe embeddings, while Samonte et al [74] applied topical word embedding. There are a few studies that have not reported the embedding model except dimensions of embedding.…”
Section: Feature Extraction (mentioning)
confidence: 99%
“…Clinical notes are used to classify the top 10 ICD-9 codes and blocks. Enhanced Hierarchical Attention Network (EnHAN) [24] utilizes discharge summary to solve ICD-9 prediction problems. To deal with multi-class label problems, the method uses topical word embedding.…”
Section: Related Work (mentioning)
confidence: 99%
“…Patients may have more than one medical problems, which leads to multiple medical diagnoses. In the study by Samonte et al [20], the proposed method of Enhanced Hierarchical Attention Network (EnHAN) followed topical word embedding and word embedding to solve this multi-class labeling and multi-label classification approach. This approach achieved a high accuracy of 0.841.…”
Section: Introduction (mentioning)
confidence: 99%
“…Note that the authors in [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25] used patient profile, clinical examination reports and physician diagnosis results to establish models for the prediction of certain diseases that are commonly seen or with high mortality rates. On the contrary, in this paper, we only make use of patients’ self-report data (i.e., the subjective component in the progress note of EMR) to establish a predictive model for a variety of diseases.…”
Section: Introduction (mentioning)
confidence: 99%