Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient’s record. We propose a representation of patients’ entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93–0.94), 30-day unplanned readmission (AUROC 0.75–0.76), prolonged length of stay (AUROC 0.85–0.86), and all of a patient’s final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient’s chart.
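The frequency-weighted AUROC reported for discharge diagnoses can be illustrated with a small sketch. It assumes that "frequency-weighted" means each diagnosis code's AUROC is weighted by the number of patients who carry that code; the abstract does not spell out the definition. The function names `auroc` and `frequency_weighted_auroc` and the dictionary layout are hypothetical, chosen for illustration only.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def frequency_weighted_auroc(labels_by_code, scores_by_code):
    """Average per-code AUROC, weighting each code by its positive count.

    Assumption: weight = number of patients with the code (not the paper's
    confirmed definition). Both arguments map a code to a per-patient list.
    """
    weighted_sum = 0.0
    total_weight = 0
    for code, labels in labels_by_code.items():
        weight = sum(labels)
        weighted_sum += weight * auroc(labels, scores_by_code[code])
        total_weight += weight
    return weighted_sum / total_weight
```

With two hypothetical codes, one held by two of four patients (AUROC 0.75) and one held by a single patient (AUROC 1.0), the weighted result is (2 × 0.75 + 1 × 1.0) / 3, so the more common code dominates the average.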
Tumor heterogeneity is a major barrier to effective cancer diagnosis and treatment. We recently identified cancer-specific differentially DNA-methylated regions (cDMRs) in colon cancer, which also distinguish normal tissue types from each other, suggesting that these cDMRs might be generalized across cancer types. Here we show stochastic methylation variation of the same cDMRs, distinguishing cancer from normal, in colon, lung, breast, thyroid, and Wilms tumors, with intermediate variation in adenomas. Whole genome bisulfite sequencing shows these variable cDMRs are related to loss of sharply delimited methylation boundaries at CpG islands. Furthermore, we find hypomethylation of discrete blocks encompassing half the genome, with extreme gene expression variability. Genes associated with the cDMRs and large blocks are involved in mitosis and matrix remodeling, respectively. These data suggest a model for cancer involving loss of epigenetic stability of well-defined genomic domains that underlies increased methylation variability in cancer and could contribute to tumor heterogeneity.
Defined transcription factors can induce epigenetic reprogramming of adult mammalian cells into induced pluripotent stem cells. Although DNA factors are integrated during some reprogramming methods, it is unknown whether the genome remains unchanged at the single nucleotide level. Here we show that 22 human induced pluripotent stem (hiPS) cell lines reprogrammed using five different methods each contained an average of five protein-coding point mutations in the regions sampled (an estimated six protein-coding point mutations per exome). The majority of these mutations were non-synonymous, nonsense, or splice variants, and were enriched in genes mutated or having causative effects in cancers. At least half of these reprogramming-associated mutations pre-existed in fibroblast progenitors at low frequencies, while the rest were newly occurring during or after reprogramming. Thus, hiPS cells acquire genetic modifications in addition to epigenetic modifications. Extensive genetic screening should become a standard procedure to ensure hiPS safety before clinical use.
Identifying genomic regions that have been targets of natural selection remains one of the most important and challenging areas of research in genetics. To this end, we report an analysis of 26,530 single nucleotide polymorphisms (SNPs) with allele frequencies that were determined in three populations. Specifically, we calculated a measure of genetic differentiation, FST, for each locus and examined its distribution at the level of the genome, the chromosome, and individual genes. Through a variety of analyses, we have found statistically significant evidence supporting the hypothesis that selection has influenced extant patterns of human genetic variation. Importantly, by contrasting the FST of individual SNPs to the empirical genome-wide distribution of FST, our results are not confounded by tenuous assumptions of population demographic history. Furthermore, we have identified 174 candidate genes with distribution of genetic variation that indicates that they have been targets of selection. Our work provides a first-generation natural selection map of the human genome and provides compelling evidence that selection has shaped extant patterns of human genomic variation.
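The idea of comparing each SNP's FST with the empirical genome-wide distribution can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the simple heterozygosity-based form FST = (H_T − H_S) / H_T with equally weighted populations, whereas the abstract does not state which estimator was used. The function names `fst_from_frequencies` and `empirical_upper_tail` are hypothetical.

```python
def fst_from_frequencies(freqs):
    """Heterozygosity-based FST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population.
    Assumption: populations are weighted equally and sample-size
    corrections are ignored.
    """
    mean_p = sum(freqs) / len(freqs)
    h_total = 2.0 * mean_p * (1.0 - mean_p)  # expected heterozygosity, pooled
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    if h_total == 0.0:
        # Monomorphic locus: FST is undefined; treated as 0.0 here for simplicity.
        return 0.0
    return (h_total - h_within) / h_total


def empirical_upper_tail(value, genome_wide):
    """Fraction of loci whose FST is at least as large as `value`.

    A small fraction marks the locus as an outlier relative to the
    genome-wide distribution, without assuming a demographic model.
    """
    return sum(1 for f in genome_wide if f >= value) / len(genome_wide)
```

For example, frequencies of 0.1, 0.5, and 0.9 in three populations give a high FST under this formula, while identical frequencies give 0. Ranking a locus against all other loci, rather than against a simulated null, is what lets the comparison sidestep explicit assumptions about population history.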
Background: Prescription opioid–related overdose deaths increased sharply during 1999–2010 in the United States in parallel with increased opioid prescribing. CDC assessed changes in national-level and county-level opioid prescribing during 2006–2015. Methods: CDC analyzed retail prescription data from QuintilesIMS to assess opioid prescribing in the United States from 2006 to 2015, including rates, amounts, dosages, and durations prescribed. CDC examined county-level prescribing patterns in 2010 and 2015. Results: The amount of opioids prescribed in the United States peaked at 782 morphine milligram equivalents (MME) per capita in 2010 and then decreased to 640 MME per capita in 2015. Despite significant decreases, the amount of opioids prescribed in 2015 remained approximately three times as high as in 1999 and varied substantially across the country. County-level factors associated with higher amounts of prescribed opioids include a larger percentage of non-Hispanic whites; a higher prevalence of diabetes and arthritis; micropolitan status (i.e., town/city; nonmetro); and higher unemployment and Medicaid enrollment. Conclusions and Implications for Public Health Practice: Despite reductions in opioid prescribing in some parts of the country, the amount of opioids prescribed remains high relative to 1999 levels and varies substantially at the county level. Given associations between opioid prescribing, opioid use disorder, and overdose rates, health care providers should carefully weigh the benefits and risks when prescribing opioids outside of end-of-life care, follow evidence-based guidelines, such as CDC’s Guideline for Prescribing Opioids for Chronic Pain, and consider nonopioid therapy for chronic pain treatment.
State and local jurisdictions can use these findings combined with Prescription Drug Monitoring Program data to identify areas with prescribing patterns that place patients at risk for opioid use disorder and overdose and to target interventions with prescribers based on opioid prescribing guidelines.