2022
DOI: 10.3390/thalassrep12010004

Random Forest Clustering Identifies Three Subgroups of β-Thalassemia with Distinct Clinical Severity

Abstract: In this work, we aimed to establish subgroups of clinical severity in a global cohort of β-thalassemia through unsupervised random forest (RF) clustering. We used a large global dataset of 7910 β-thalassemia patients and evaluated 19 indicators of phenotype severity (IPhS) to determine their contribution and relatedness in grouping β-thalassemia patients into clusters using RF analysis. RF clustering suggested that three clusters with minimal overlapping exist (classification error rate: 4.3%), and six importa…

Citations: cited by 3 publications (2 citation statements)
References: 26 publications
“…The three phenogroups were comparable in terms of the frequency of regular transfusions. This finding recalls a recent study that applied unsupervised RF clustering to identify subgroups of clinical severity in a large cohort of β-thalassemia patients, including TM patients, regularly transfused TI patients, and non-transfused TI patients [73]. Nineteen indicators of phenotype severity, which did not include MRI data, were considered, and the presence of regular transfusions did not play a significant role in grouping the patients into phenogroups.…”
Section: Discussion (supporting)
confidence: 65%
“…In comparison to PAM alone, RF clustering (RF-derived proximity measure combined with PAM clustering) can be used to handle nonlinear data with possible outliers and higher noise-signal ratio. RF clustering has been shown to be an effective method to determine the underlying structure of unlabeled data (38)(39)(40). RF was performed with 10,000 trees without resampling or replication.…”
Section: Discussion (mentioning)
confidence: 99%
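
The last citation statement describes RF clustering as an RF-derived proximity measure combined with PAM clustering. A minimal Python sketch of that idea follows, assuming NumPy and scikit-learn are available. The function names (`rf_proximity`, `k_medoids`), the toy data, the tree count, and the column-permutation scheme for generating synthetic contrast data are illustrative assumptions, not the settings of the cited studies. The medoid step is a simple alternating k-medoids routine used as a stand-in for full PAM, and the forest uses scikit-learn's default bootstrap sampling rather than the "without resampling" setup quoted above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rf_proximity(X, n_trees=200, seed=0):
    """Unsupervised RF proximity: share of trees in which two samples land in the same leaf."""
    rng = np.random.default_rng(seed)
    # Synthetic contrast data (assumption): permute each column independently,
    # which keeps the marginals but breaks dependencies between features.
    X_synth = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
    X_all = np.vstack([X, X_synth])
    y_all = np.r_[np.ones(len(X)), np.zeros(len(X_synth))]
    # The forest learns to tell real rows from synthetic rows.
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)
    leaves = forest.apply(X)  # leaf index per sample and tree, shape (n_samples, n_trees)
    prox = np.zeros((len(X), len(X)))
    for t in range(leaves.shape[1]):
        prox += leaves[:, t][:, None] == leaves[:, t][None, :]
    return prox / leaves.shape[1]


def k_medoids(D, k, n_iter=100, seed=0):
    """Alternating k-medoids on a dissimilarity matrix (a simplified stand-in for PAM)."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest total distance to its cluster.
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(D[:, medoids], axis=1)
    return labels, medoids


# Toy data (hypothetical): three Gaussian blobs with 4 features, 20 rows each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(20, 4)) for m in (0.0, 3.0, 6.0)])

prox = rf_proximity(X)
dissimilarity = np.sqrt(1.0 - prox)  # a common proximity-to-distance transform
labels, medoids = k_medoids(dissimilarity, k=3)
print(np.bincount(labels, minlength=3))
```

On real data the number of clusters would be chosen by inspecting a quality criterion such as silhouette width across several values of `k`, rather than fixed in advance as it is here.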