For over a decade the term “Big data” has been used to describe the rapid increase in volume, variety and velocity of information available, not just in medical research but in almost every aspect of our lives. As scientists, we now have the capacity to rapidly generate, store and analyse data that, only a few years ago, would have taken many years to compile. However, “Big data” no longer means what it once did. The term has expanded and now refers not just to large data volume, but to our increasing ability to analyse and interpret those data. Tautologies such as “data analytics” and “data science” have emerged to describe approaches to the volume of available information as it grows ever larger. New methods dedicated to improving data collection, storage, cleaning, processing and interpretation continue to be developed, although not always by, or for, medical researchers. Exploiting new tools to extract meaning from large-volume information has the potential to drive real change in clinical practice, from personalized therapy and intelligent drug design to population screening and electronic health record mining. As ever, where new technology promises “Big Advances,” significant challenges remain. Here we discuss both the opportunities and challenges posed to biomedical research by our increasing ability to tackle large datasets. Important challenges include the need for standardization of data content, format, and clinical definitions; a heightened need for collaborative networks that share both data and expertise; and, perhaps most importantly, a need to reconsider how and when analytic methodology is taught to medical researchers. We also set “Big data” analytics in context: recent advances may appear to promise a revolution, sweeping away conventional approaches to medical science. However, their real promise lies in their synergy with, not replacement of, classical hypothesis-driven methods.
The generation of novel, data-driven hypotheses based on interpretable models will always require stringent validation and experimental testing. Thus, hypothesis-generating research founded on large datasets adds to, rather than replaces, traditional hypothesis-driven science. Each can benefit from the other, and it is through using both that we can improve clinical practice.
Summary
Background Rare genetic variants cause pulmonary arterial hypertension, but the contribution of common genetic variation to disease risk and natural history is poorly characterised. We tested for genome-wide association for pulmonary arterial hypertension in large international cohorts and assessed the contribution of associated regions to outcomes.
Methods We did two separate genome-wide association studies (GWAS) and a meta-analysis of pulmonary arterial hypertension. These GWAS used data from four international case-control studies across 11 744 individuals with European ancestry (including 2085 patients). One GWAS used genotypes from 5895 whole-genome sequences and the other GWAS used genotyping array data from an additional 5849 individuals. Cross-validation of loci reaching genome-wide significance was sought by meta-analysis. Conditional analysis corrected for the most significant variants at each locus was used to resolve signals for multiple associations. We functionally annotated associated variants and tested associations with duration of survival. All-cause mortality was the primary endpoint in survival analyses.
Findings A locus near SOX17 (rs10103692, odds ratio 1·80 [95% CI 1·55–2·08], p=5·13 × 10⁻¹⁵) and a second locus in HLA-DPA1 and HLA-DPB1 (collectively referred to as HLA-DPA1/DPB1 here; rs2856830, 1·56 [1·42–1·71], p=7·65 × 10⁻²⁰) within the class II MHC region were associated with pulmonary arterial hypertension. The SOX17 locus had two independent signals associated with pulmonary arterial hypertension (rs13266183, 1·36 [1·25–1·48], p=1·69 × 10⁻¹²; and rs10103692). Functional and epigenomic data indicate that the risk variants near SOX17 alter gene regulation via an enhancer active in endothelial cells. Pulmonary arterial hypertension risk variants determined haplotype-specific enhancer activity, and CRISPR-mediated inhibition of the enhancer reduced SOX17 expression. The HLA-DPA1/DPB1 rs2856830 genotype was strongly associated with survival. Median survival from diagnosis in patients with pulmonary arterial hypertension with the C/C homozygous genotype was double (13·50 years [95% CI 12·07 to >13·50]) that of those with the T/T genotype (6·97 years [6·02–8·05]), despite similar baseline disease severity.
Interpretation This is the first study to report that common genetic variation at loci in an enhancer near SOX17 and in HLA-DPA1/DPB1 is associated with pulmonary arterial hypertension. Impairment of SOX17 function might be more common in pulmonary arterial hypertension than suggested by rare mutations in ...
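The relationship between a reported odds ratio, its confidence interval and its p-value, and the way per-study estimates are pooled in a meta-analysis, can be illustrated with a minimal Python sketch. It assumes a Wald test on the log odds ratio and a fixed-effect inverse-variance model; the abstract does not state which test or meta-analysis model the authors used, so both are assumptions. The function names and the per-study estimates at the end are hypothetical and serve only as an illustration.

```python
import math

def se_from_ci(lower, upper, z_crit=1.959964):
    """Recover the standard error of log(OR) from a 95% CI given on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted pooled log(OR) and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))

# Reported SOX17 signal rs10103692: odds ratio 1.80, 95% CI 1.55 to 2.08.
beta = math.log(1.80)
se = se_from_ci(1.55, 2.08)
z = beta / se
p = two_sided_p(z)
print(f"z = {z:.2f}, p = {p:.2e}")

# Hypothetical per-study log odds ratios and standard errors (not from the paper).
study_betas = [0.55, 0.62]
study_ses = [0.10, 0.11]
pooled_beta, pooled_se = fixed_effect_meta(study_betas, study_ses)
print(f"pooled OR = {math.exp(pooled_beta):.2f}, SE = {pooled_se:.3f}")
```

Under these assumptions the p-value recovered for rs10103692 is of the same order of magnitude as the reported one, which is a quick consistency check a reader can apply to any odds ratio reported with a confidence interval.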
The use of electronic medical record data linked to biological specimens in health care settings is expected to enable cost-effective and rapid genomic analyses. Here, we present a model that highlights potential advantages for genomic discovery and describe the operational infrastructure that facilitated multiple simultaneous discovery efforts.
While many phenotypes have been associated with variants in human leukocyte antigen (HLA) genes, the full phenotypic impact of HLA variants across all diseases is unknown. We imputed HLA genomic variation from two populations of 28,839 and 8,431 European ancestry individuals and tested association of HLA variation with 1,368 phenotypes. A total of 104 four-digit and 92 two-digit HLA allele-phenotype associations were significant in both discovery and replication cohorts, the strongest being HLA-DQB1*03:02 and type 1 diabetes. Four previously unidentified associations were identified across the spectrum of disease with two- and four-digit HLA alleles and ten with non-synonymous variants. Some conditions were associated with multiple HLA variants, and stronger associations were identified with more severe disease manifestations. A comprehensive, publicly available catalog of clinical phenotypes associated with HLA variation is provided. Examining HLA variant disease associations in this large dataset allows comprehensive definition of disease associations to drive further mechanistic insights.
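The requirement that an association be significant in both the discovery and the replication cohort can be sketched as a simple filter. The abstract does not give the significance thresholds, so this example assumes a Bonferroni-corrected threshold in the discovery cohort, a nominal threshold in the replication cohort, and a consistent direction of effect. The class, the function and the toy records are hypothetical placeholders, not results from the study.

```python
from dataclasses import dataclass

@dataclass
class Association:
    allele: str
    phenotype: str
    p_discovery: float
    p_replication: float
    beta_discovery: float
    beta_replication: float

def replicated(assocs, n_tests, alpha=0.05):
    """Keep associations passing an assumed Bonferroni threshold in discovery,
    nominal significance in replication, and the same direction of effect."""
    threshold = alpha / n_tests
    return [
        a for a in assocs
        if a.p_discovery < threshold
        and a.p_replication < alpha
        and a.beta_discovery * a.beta_replication > 0
    ]

# Toy records with placeholder names (not data from the study).
toy = [
    Association("ALLELE_A", "PHENOTYPE_1", 1e-12, 1e-4, 0.9, 0.7),
    Association("ALLELE_B", "PHENOTYPE_2", 1e-3, 1e-3, 0.3, 0.2),   # fails discovery threshold
    Association("ALLELE_C", "PHENOTYPE_3", 1e-10, 0.20, 0.5, 0.4),  # fails replication
    Association("ALLELE_D", "PHENOTYPE_4", 1e-9, 0.01, 0.4, -0.3),  # opposite directions
]
hits = replicated(toy, n_tests=1368)
print([a.allele for a in hits])
```

Here `n_tests` is set to the number of phenotypes only for illustration; a real analysis would count every allele-phenotype pair tested.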
Phenytoin is an antiepileptic drug with a narrow therapeutic index and large interpatient pharmacokinetic variability, partly due to genetic variation in CYP2C9. Furthermore, the variant allele HLA-B*15:02 is associated with an increased risk of Stevens-Johnson syndrome and toxic epidermal necrolysis in response to phenytoin treatment. We summarize evidence from the published literature supporting these associations and provide therapeutic recommendations for the use of phenytoin based on CYP2C9 and/or HLA-B genotypes (updates on cpicpgx.org). The purpose of this guideline is to provide information for the interpretation of human leukocyte antigen B (HLA-B) and/or cytochrome P450 2C9 (CYP2C9) genotype test results to guide use and/or dosing of phenytoin. Guidelines for phenytoin use and cost-effectiveness of genetic testing are outside the scope of this report. This guideline updates the 2014 Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for CYP2C9 and HLA-B Genotypes and Phenytoin Dosing.¹ CPIC guidelines are periodically updated at www.cpicpgx.org.
FOCUSED LITERATURE REVIEW
We reviewed literature focused on CYP2C9 and HLA variation and phenytoin use (details in Supplementary Material). Evidence is summarized in Table S1 and Table S2.
Genes: HLA-B and CYP2C9
Background. This guideline discusses HLA-B and the risk of Stevens-Johnson syndrome (SJS) and toxic epidermal necrolysis (TEN) with phenytoin and CYP2C9 as it relates to phenytoin metabolism and dosing. Updated CYP2C9 allele function assignments are provided using the activity score system.