2019
DOI: 10.1186/s12911-019-0805-0

Identifying clinically important COPD sub-types using data-driven approaches in primary care population based electronic health records

Abstract: Background: COPD is a highly heterogeneous disease composed of different phenotypes with different aetiological and prognostic profiles, and current classification systems do not fully capture this heterogeneity. In this study we sought to discover, describe and validate COPD subtypes using cluster analysis on data derived from electronic health records. Methods: We applied two unsupervised learning algorithms (k-means and hierarchical clustering) in 30,961 current and for…

Citations: cited by 65 publications (82 citation statements)
References: 37 publications
“…It creates the most statistically valid clusters, minimizing distances within clusters and maximizing distances between clusters, such that the subgroups are based only on the data themselves, without intervention of the researcher. Recently, the feasibility of cluster analyses for untangling disease heterogeneity and identifying subgroups has gained interest…”
Section: Discussion (mentioning)
confidence: 99%
“…Although this does not mean that it is free of limitations and biases, this method is currently being applied in many health sciences fields. In this particular concept of grouping patients according to their diagnoses, though, a large number of studies on chronic obstructive pulmonary disease (COPD) stands out [3, 5-8]. With these studies, knowledge of diagnoses associated with COPD has not only improved, but it has also allowed an improvement of the statistical methodology to assist studies regarding high-dimensionality healthcare data.…”
Section: Introduction (mentioning)
confidence: 99%
“…With diagnostic variables, which are those that raise our concerns, we often find all of these problems: a large number of dichotomous and possible unnecessary variables and, very likely, high collinearity. The latter is especially important, since it could dominate patient assignments into clusters [5].…”
Section: Introduction (mentioning)
confidence: 99%