BackgroundSegmentation of heterogeneous patient populations into parsimonious and relatively homogenous groups with similar healthcare needs can facilitate healthcare resource planning and development of effective integrated healthcare interventions for each segment. We aimed to apply a data-driven, healthcare utilization-based clustering analysis to segment a regional health system patient population and validate its discriminative ability on 4-year longitudinal healthcare utilization and mortality data.MethodsWe extracted data from the Singapore Health Services Electronic Health Intelligence System, an electronic medical record database that included healthcare utilization (inpatient admissions, specialist outpatient clinic visits, emergency department visits, and primary care clinic visits), mortality, diseases, and demographics for all adult Singapore residents who resided in and had a healthcare encounter with our regional health system in 2012. Hierarchical clustering analysis (Ward’s linkage) and K-means cluster analysis using age and healthcare utilization data in 2012 were applied to segment the selected population. These segments were compared using their demographics (other than age) and morbidities in 2012, and longitudinal healthcare utilization and mortality from 2013–2016.ResultsAmong 146,999 subjects, five distinct patient segments “Young, healthy”; “Middle age, healthy”; “Stable, chronic disease”; “Complicated chronic disease” and “Frequent admitters” were identified. Healthcare utilization patterns in 2012, morbidity patterns and demographics differed significantly across all segments. The “Frequent admitters” segment had the smallest number of patients (1.79% of the population) but consumed 69% of inpatient admissions, 77% of specialist outpatient visits, 54% of emergency department visits, and 23% of primary care clinic visits in 2012. 11.5% and 31.2% of this segment has end stage renal failure and malignancy respectively. The validity of cluster-analysis derived segments is supported by discriminative ability for longitudinal healthcare utilization and mortality from 2013–2016. Incident rate ratios for healthcare utilization and Cox hazards ratio for mortality increased as patient segments increased in complexity. Patients in the “Frequent admitters” segment accounted for a disproportionate healthcare utilization and 8.16 times higher mortality rate.ConclusionOur data-driven clustering analysis on a general patient population in Singapore identified five patient segments with distinct longitudinal healthcare utilization patterns and mortality risk to provide an evidence-based segmentation of a regional health system’s healthcare needs.