Objective
The heterogeneity of pediatric sepsis patients suggests that clustering analytics could derive phenotypes with distinct host response patterns, which may help guide personalized therapeutics. We evaluate the relative performance of latent class analysis (LCA) and K‐means, 2 commonly used clustering methods, in deriving clinically useful pediatric sepsis phenotypes.
Methods
Data were extracted from anonymized medical records of 6446 pediatric patients who presented to 1 of 6 emergency departments (EDs) between 2013 and 2018 and were subsequently admitted. Using International Classification of Diseases (ICD)‐9 and ICD‐10 discharge codes, 151 patients were identified with a sepsis continuum diagnosis that included septicemia, sepsis, severe sepsis, and septic shock. With feature sets drawn from related clustering studies, LCA and K‐means algorithms were applied to derive 4 distinct phenotypic segmentations of pediatric sepsis. Each segmentation was evaluated for phenotypic homogeneity, separation, and clinical utility.
Results
Using the 2 feature sets, LCA clustering resulted in 2 similar segmentations of 4 clinically distinct phenotypes, while K‐means clustering resulted in segmentations of 3 and 4 phenotypes. All 4 segmentations identified at least 1 high‐severity phenotype, but LCA‐identified phenotypes showed superior stratification; high entropy approaching 1 (eg, 0.994), indicating excellent separation between estimated phenotypes; and differential treatment/treatment response and outcomes that were non‐randomly distributed across phenotypes (P < 0.001).
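As an illustration of why an entropy value near 1 indicates well-separated phenotypes, the sketch below computes the normalized (relative) entropy often reported for latent class models. It assumes that this is the statistic behind the reported value, which the abstract does not state; the function name and the example posterior probabilities are hypothetical and are not study data.

```python
import math


def relative_entropy(posteriors):
    """Normalized entropy for a latent class solution (illustrative sketch).

    posteriors: list of rows, one per patient, each row holding that
    patient's posterior membership probabilities across K classes.
    Returns a value in [0, 1]; values near 1 mean each patient is
    assigned to one class with near certainty.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    if n == 0 or k < 2:
        raise ValueError("need at least 1 patient and 2 classes")
    # Sum of -p * ln(p) over all patients and classes (0 * ln 0 taken as 0).
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:
                total -= p * math.log(p)
    # Divide by the maximum possible value, n * ln(k), and subtract from 1.
    return 1.0 - total / (n * math.log(k))


# Hypothetical posteriors for 3 patients and 4 classes (not study data).
crisp = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
fuzzy = [[0.25, 0.25, 0.25, 0.25]] * 3
```

Under this definition, the `crisp` assignments yield the maximum value of 1, whereas the uniformly uncertain `fuzzy` assignments yield a value of approximately 0.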
Conclusion
Compared to K‐means, which is commonly used in clustering studies, LCA appears to be a more robust, clinically useful statistical tool for analyzing a heterogeneous pediatric sepsis cohort to inform targeted therapies. Additional prospective studies are needed to validate the clinical utility of predictive models that target derived pediatric sepsis phenotypes in emergency department settings.