Background
Widespread implementation of electronic health records (EHRs) has created new opportunities for observational research in pediatric oncology. However, little attention has been given to using EHR data to identify patients with pediatric hematologic malignancies.
Methods
This study used EHR‐derived data from PEDSnet, a pediatric clinical data research network, to develop and evaluate a computable phenotype algorithm that identifies pediatric patients with leukemia and lymphoma who received treatment with chemotherapy. To guide early development, multiple computable phenotype‐defined cohorts were compared with one institution's tumor registry. The most promising algorithm was chosen for formal evaluation; it required at least two leukemia/lymphoma diagnoses (Systematized Nomenclature of Medicine codes) within a 90‐day period, two chemotherapy exposures, and three hematology‐oncology provider encounters. During evaluation, the computable phenotype was executed against EHR data from 2011 to 2016 at three large institutions. Classification accuracy was assessed by masked medical record review, with phenotype‐identified patients compared with a control group of patients who had at least three hematology‐oncology encounters.
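The selection rule described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the `PatientRecord` structure and the function names are hypothetical, the chemotherapy and encounter thresholds are read as "at least" two and three, and two diagnoses recorded on the same date are assumed to count as two diagnoses. None of these details is specified in this abstract.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List


@dataclass
class PatientRecord:
    """Hypothetical flattened view of one patient's EHR data."""
    # Dates of leukemia/lymphoma diagnoses (already filtered to relevant codes)
    diagnosis_dates: List[date] = field(default_factory=list)
    chemo_exposures: int = 0      # count of chemotherapy exposures
    hem_onc_encounters: int = 0   # count of hematology-oncology encounters


def has_two_diagnoses_within(dates: List[date], window_days: int = 90) -> bool:
    """True if any two diagnoses fall within `window_days` of each other.

    After sorting, it is enough to check adjacent pairs: if any pair is
    within the window, the adjacent pair between them is too.
    """
    ordered = sorted(dates)
    window = timedelta(days=window_days)
    return any(later - earlier <= window
               for earlier, later in zip(ordered, ordered[1:]))


def meets_phenotype(p: PatientRecord) -> bool:
    """Apply all three criteria; thresholds are assumed to be minimums."""
    return (has_two_diagnoses_within(p.diagnosis_dates)
            and p.chemo_exposures >= 2
            and p.hem_onc_encounters >= 3)
```

For example, a hypothetical patient with diagnoses on 2012‐01‐01 and 2012‐03‐01, two chemotherapy exposures, and three encounters would satisfy the rule, whereas the same patient with diagnoses five months apart would not.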
Results
The computable phenotype had sensitivity of 100% (confidence interval [CI] 99%, 100%), specificity of 99% (CI 99%, 100%), positive predictive value (PPV) and negative predictive value (NPV) of 100%, and C‐statistic of 1 at the development institution. The computable phenotype performance was similar at the two test institutions with sensitivity of 100% (CI 99%, 100%), specificity of 99% (CI 99%, 100%), PPV of 96%, NPV of 100%, and C‐statistic of 0.99.
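For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, PPV, and NPV follow from the four cells of a two‐by‐two table comparing phenotype classification with chart review. The function name and the counts in the usage example are purely illustrative assumptions and are not the study's data. For a single yes/no classifier, the C‐statistic reduces to the mean of sensitivity and specificity, which is the form used here.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute accuracy metrics from confusion-matrix counts.

    tp/fn: true cases flagged / missed by the phenotype
    fp/tn: non-cases flagged / correctly not flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # For a binary classifier, the area under the ROC curve equals
        # the average of sensitivity and specificity.
        "c_statistic": (sensitivity + specificity) / 2,
    }


# Illustrative counts only (not taken from the study)
example = classification_metrics(tp=8, fp=1, fn=2, tn=9)
```

With a sensitivity of 100% and a specificity of 99%, this relationship gives a C‐statistic of 0.995, consistent with the reported values of 0.99 and 1.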
Conclusion
The EHR‐based computable phenotype is an accurate cohort identification tool for pediatric patients with leukemia and lymphoma who have been treated with chemotherapy and is ready for use in clinical studies.