“… 26, 27 Data-driven approaches for uncovering cohort heterogeneity can be divided into unsupervised techniques, which ignore the outcome being studied, and supervised techniques, which incorporate information about the outcome. Unsupervised techniques include clustering 10, 28, 29 and latent class analysis 30, 31 of clinical phenotypes; supervised techniques include mixture of experts (MoE) 32, 33, 34, 35 and subgroup discovery algorithms 17, 36 to subdivide large cohorts. Following stratification of individuals and identification of the heterogeneity present, techniques such as hierarchical modeling of subcohorts and ensemble learning can then be employed to improve the prediction of CAD in the whole cohort.…”