“…Comprising heterogeneous clinical encounters including diagnostic and procedural billing codes, lab tests, prescriptions, and free text clinical notes for millions of patients, these rich data offer abundant opportunities for in silico epidemiological analysis. One application that has garnered recent interest is estimation of population disease risk within EHR patient cohorts, which can enable more powerful and precise estimation of real-world disease risks as well as comparative effectiveness analysis of alternative treatment strategies (Hodgkins and others, 2017; Dean and others, 2003; Liu and others, 2018; Panahiazar and others, 2015; Steele and others, 2018). Several studies have had success estimating time to death within rule-defined disease cohorts (Panahiazar and others, 2015; Steele and others, 2018).…”