“…Each of the 200 simulated data sets includes the following information for each individual: case-control status, three continuous quantitative traits (Q1, Q2, Q4), and three phenotypic features (Age, Smoking status, and Sex). We use a multidimensional scaling analysis based on genome-wide pairwise identity-by-state distances computed in PLINK [14] to determine three main continental population strata: African (Luhya, Luhya-additional, Yoruba-1, Yoruba-2, Yoruba-additional), Asian (Denver Chinese, Denver Chinese-additional, Han Chinese-1, Han Chinese-2, Han Chinese-additional, Japanese-1, Japanese-2, Japanese-additional), and European (CEPH-1, CEPH-2, Tuscan, and Tuscan-additional) [15, 16]. We then generate three binary features to include in our model, assigning patients to their corresponding Asian, European, and African populations.…”
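
The final encoding step described in the passage can be illustrated with a minimal sketch. It assumes the stratum of each sample group is already known and simply looks it up from the grouping listed above; it does not reproduce the multidimensional scaling analysis or any PLINK call. The names `STRATA`, `LABEL_TO_STRATUM`, `ancestry_indicators`, and the `is_*` feature names are hypothetical and chosen only for illustration.

```python
# Sample groups per continental stratum, copied from the grouping in the text.
# In the passage this grouping is derived from an MDS analysis; here it is
# hard-coded as an assumption so the encoding step can be shown on its own.
STRATA = {
    "African": [
        "Luhya", "Luhya-additional",
        "Yoruba-1", "Yoruba-2", "Yoruba-additional",
    ],
    "Asian": [
        "Denver Chinese", "Denver Chinese-additional",
        "Han Chinese-1", "Han Chinese-2", "Han Chinese-additional",
        "Japanese-1", "Japanese-2", "Japanese-additional",
    ],
    "European": ["CEPH-1", "CEPH-2", "Tuscan", "Tuscan-additional"],
}

# Reverse lookup: sample-group label -> stratum name.
LABEL_TO_STRATUM = {
    label: stratum for stratum, labels in STRATA.items() for label in labels
}


def ancestry_indicators(label):
    """Return three 0/1 features, exactly one of which is 1 (hypothetical helper)."""
    stratum = LABEL_TO_STRATUM[label]  # raises KeyError for an unknown label
    return {f"is_{name.lower()}": int(name == stratum) for name in STRATA}


if __name__ == "__main__":
    for group in ("Yoruba-1", "Japanese-2", "Tuscan"):
        print(group, ancestry_indicators(group))
```

Each individual receives exactly one indicator equal to 1, which matches the description of assigning every patient to a single continental population.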