2013 IEEE 13th International Conference on Data Mining Workshops 2013
DOI: 10.1109/icdmw.2013.91

An Empirical Analysis of Topic Modeling for Mining Cancer Clinical Notes

Abstract: Using a variety of techniques including Topic Modeling, Principal Component Analysis and Bi-clustering, we explore electronic patient records in the form of unstructured clinical notes and genetic mutation test results. Our ultimate goal is to gain insight into a unique body of clinical data, specifically regarding the topics discussed within the note content and relationships between patient clinical notes and their underlying genetics.

Cited by 28 publications (17 citation statements)
References 17 publications
“…We compare our proposed model with Labeled LDA [17], a supervised counterpart of Latent Dirichlet Allocation (LDA) [18], which has been applied previously to clinical data analysis [19]–[21]. The results show that our representations indeed capture the relationship between words and codes.…”
Section: Introduction
confidence: 99%
“…Recently, topic models have been employed in the clinical domain in problems such as cased-based retrieval [21]; characterizing clinical concepts over time [22]; and predicting patient satisfaction [23], depression [24], infection [25], and mortality [26]. Additional work has been performed in using topic modeling methods to search for relationships between themes discovered in clinical notes and underlying patient genetics [27]. …”
Section: Introduction
confidence: 99%
“…This technique is a critical component in most text analytics pipelines. The resulting topic models can provide informative and concise summaries of the content of large corpora, which can effectively support their exploration and analysis in diverse domains such as health care [7] and education [35]. However, evaluation of topic modeling results is challenging because no general gold standard is available, and automatic evaluation metrics, like topic coherence [39], at best weakly correlate with human judgments of quality [8,10].…”
Section: Introduction
confidence: 99%