2016
DOI: 10.1186/s12882-016-0238-2

Cluster analysis and its application to healthcare claims data: a study of end-stage renal disease patients who initiated hemodialysis

Abstract: Background: Cluster analysis (CA) is a frequently used applied statistical technique that helps to reveal hidden structures and “clusters” found in large data sets. However, this method has not been widely used in large healthcare claims databases where the distribution of expenditure data is commonly severely skewed. The purpose of this study was to identify cost change patterns of patients with end-stage renal disease (ESRD) who initiated hemodialysis (HD) by applying different clustering methods. Methods: A retr…

Citations: cited by 92 publications (56 citation statements)
References: 31 publications
“…The meaning of this study is to identify the actual burden on ESRD because the cost studies in the past have been based on insurance claims data [17, 29-34]. The detailed meaning is as follows.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…Aim (2) Identify factors associated with high/low health service usage/cost: We will develop a detailed data analytical plan to undertake statistical analyses of cost data and identify factors associated with high/low health service usage/cost, such as socio-demographic (e.g., geographic location, ethnicity, age, sex) and clinical characteristics (e.g., type of cancer, time since diagnosis). We will consider appropriate statistical approaches, such as non-parametric bootstrapping [40], cluster analysis [41], quantile regression analyses [42], two-part models (if zero values are an issue), quintile regression (splitting cost outcomes into 5 levels) and Generalized Linear Models (GLM) [43]. The latter are particularly suitable for highly right-skewed data, such as health care costs.…”
Section: Data Storage (citation type: mentioning)
confidence: 99%
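The statement above lists non-parametric bootstrapping among the approaches suited to right-skewed cost data. As a rough illustration only, here is a minimal percentile-bootstrap sketch for the mean cost. The function name `bootstrap_mean_ci`, its parameters, and the example figures are all hypothetical and made up for this sketch; they come from neither the cited study nor the citing protocol.

```python
import random
import statistics


def bootstrap_mean_ci(costs, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `costs`.

    Hypothetical helper for illustration; not taken from any cited paper.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(costs)
    means = []
    for _ in range(n_resamples):
        # Resample n observations with replacement and record the mean.
        resample = [costs[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Read the interval bounds off the sorted bootstrap means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.fmean(costs), lower, upper


# Made-up, right-skewed costs (not study data): one very large value dominates.
example_costs = [120, 150, 180, 200, 240, 300, 450, 900, 2500, 12000]
mean_cost, ci_low, ci_high = bootstrap_mean_ci(example_costs)
print(mean_cost, ci_low, ci_high)
```

With skewed data like this, the interval is typically asymmetric around the mean, which is one reason resampling is often preferred over a normal-theory interval for cost outcomes.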
“…Within healthcare, clustering techniques have been largely used to identify patient groups. K-means clustering appears to be very widely used in identifying patient groups which have high degrees of similarity (Tomar and Agarwal 2013; Liao et al 2016). To take one example, clustering aids in the identification of entity groups (e.g.…”
Section: Data Analytics Model (citation type: mentioning)
confidence: 99%
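To make the K-means idea mentioned above concrete, the following is a minimal one-dimensional sketch of Lloyd's algorithm applied to log-transformed costs. It is an assumption-laden illustration, not the procedure used in the cited study: the log transform is one common way to tame skew, the function `kmeans_1d` and its initialization scheme are hypothetical, and the cost values are invented.

```python
import math


def kmeans_1d(values, k, n_iter=100):
    """Plain Lloyd's algorithm for 1-D data; returns (centroids, labels).

    Hypothetical helper for illustration; not taken from any cited paper.
    """
    ordered = sorted(values)
    # Deterministic start: pick k points spread evenly through the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda j: (v - centroids[j]) ** 2) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Made-up patient costs (not study data) spanning three orders of magnitude.
costs = [100, 120, 150, 1000, 1300, 1100, 20000, 25000, 18000]
log_costs = [math.log(c) for c in costs]  # assumed transform to reduce skew
centroids, labels = kmeans_1d(log_costs, k=3)
print(labels)
```

Clustering on the raw costs instead would let the few very large values dominate the squared distances, so the low-cost and mid-cost patients would tend to merge into one group.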