2018
DOI: 10.1038/sdata.2018.273

Columbia Open Health Data, clinical concept prevalence and co-occurrence from electronic health records

Abstract: Columbia Open Health Data (COHD) is a publicly accessible database of electronic health record (EHR) prevalence and co-occurrence frequencies between conditions, drugs, procedures, and demographics. COHD was derived from Columbia University Irving Medical Center’s Observational Health Data Sciences and Informatics (OHDSI) database. The lifetime dataset, derived from all records, contains 36,578 single concepts (11,952 conditions, 12,334 drugs, and 10,816 procedures) and 32,788,901 concept pairs from 5,364,781 …

Citations: cited by 49 publications (52 citation statements)
References: 32 publications
“…Among these, the drug dexamethasone is predicted to treat posterior uveitis (a type of inflammation in the eye) with a probability of 0.838129. Using an additional knowledge source, Columbia Open Health Data (COHD) [86] which provides anonymized key word search over 1.7 million health records, we find that dexamethasone co-occurs with posterior uveitis at a rate 2.789 times more frequently than would be expected in a general population. This may indicate that dexamethasone is being prescribed as an off-label treatment for this disease.…”
Section: Validation Of Results (mentioning)
confidence: 99%
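The "2.789 times more frequently than would be expected" figure in the statement above reads like an observed-to-expected ratio. The quoted text does not say how it was computed, so the following is only a minimal sketch of one common definition, assuming the expected pair count is what independence of the two concepts would give. The function name, parameter names, and all counts are hypothetical and are not taken from COHD or the citing paper.

```python
def observed_expected_ratio(pair_count, count_a, count_b, n_patients):
    """Observed co-occurrence count divided by the count expected
    if concept A and concept B occurred independently (an assumption)."""
    if min(count_a, count_b, n_patients) <= 0:
        raise ValueError("count_a, count_b and n_patients must be positive")
    # Under independence: expected = N * (count_a / N) * (count_b / N)
    expected = count_a * count_b / n_patients
    return pair_count / expected


# Hypothetical counts for illustration only; not real COHD data.
ratio = observed_expected_ratio(
    pair_count=30, count_a=1_000, count_b=6_000, n_patients=1_000_000
)
print(ratio)
```

A ratio above 1 means the pair appears together more often than independence would predict; on its own it says nothing about why, which is consistent with the citing authors' cautious "may indicate".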
“…The blue and yellow lines indicate the performance of the node2vec and DWPC approaches respectively when restricting the training set to only 147 diseases to facilitate a direct comparison due to efficiency constraints experienced by the DWPC method. such as when we used the plot as in Figure 3 to find a candidate threshold that filters out all but the most likely drug repurposing candidates or leveraging additional data sources such as Columbia Open Health Data [86] as in Section 3.1.…”
Section: Discussion (mentioning)
confidence: 99%
“…For the sake of patient privacy and security, it is usually quite difficult, if not impossible, for medical institutes to grant public access to large-scale raw or even de-identified clinical texts [2]. Consequently, medical terms and their aggregated co-occurrence counts extracted from raw clinical texts are becoming a popular (although not perfect) substitute for raw clinical texts for the research community to study EMR data [2,8,33]. For example, Finlayson et al [8] released millions of medical terms extracted from the clinical texts in Stanford Hospitals and Clinics as well as their global co-occurrence counts, rather than releasing raw sentences/paragraphs/documents from the clinical text corpus.…”
Section: Global Context Information (mentioning)
confidence: 99%
“…For example, Finlayson et al [8] prune the edges between two terms co-occurring less than 100 times, which can lead to missing edges between two related terms in the co-occurrence graph. Ta et al [33] remove all concepts with singleton frequency counts below 10. Hence, the noisy nature of the co-occurrence graph makes it less accurate to embed a term based on their original contexts.…”
Section: Global Context Information (mentioning)
confidence: 99%
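The two kinds of count-based pruning described in the statement above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used by either cited dataset: the function names, the dictionary layout, and the example terms and counts are hypothetical, and only the thresholds (100 for pairs, 10 for single concepts) come from the quoted text.

```python
def drop_rare_concepts(concept_counts, min_count=10):
    """Keep single concepts whose frequency count is at least min_count."""
    return {c: n for c, n in concept_counts.items() if n >= min_count}


def drop_rare_edges(pair_counts, min_count=100):
    """Keep co-occurrence edges seen at least min_count times."""
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Hypothetical toy counts for illustration only.
concepts = {"aspirin": 250, "rare_term": 4}
pairs = {("aspirin", "headache"): 120, ("aspirin", "rare_term"): 3}

kept_concepts = drop_rare_concepts(concepts)
kept_pairs = drop_rare_edges(pairs)
print(kept_concepts, kept_pairs)
```

The sketch makes the citing authors' concern concrete: the pair ("aspirin", "rare_term") disappears from the graph even if the two terms are genuinely related, simply because it falls under the cutoff.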