Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a P value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
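To make the baseline concrete, the pruning-followed-by-thresholding approach described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the greedy keep-the-most-significant-marker pruning rule, and the default cutoffs (`r2_max=0.2`, `p_max=0.05`) are all assumptions chosen for the example.

```python
from typing import List, Sequence


def prune_and_threshold(
    pvalues: Sequence[float],
    ld_r2: Sequence[Sequence[float]],
    r2_max: float = 0.2,   # assumed LD cutoff, for illustration only
    p_max: float = 0.05,   # assumed P value cutoff, for illustration only
) -> List[int]:
    """Return indices of markers kept after greedy LD pruning and thresholding.

    Markers are visited from most to least significant. A marker is kept if
    its P value passes the threshold and its squared correlation (r^2) with
    every already-kept marker is below r2_max.
    """
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    kept: List[int] = []
    for i in order:
        if pvalues[i] > p_max:
            break  # all remaining markers are less significant
        if all(ld_r2[i][j] < r2_max for j in kept):
            kept.append(i)
    return sorted(kept)


def risk_score(
    genotypes: Sequence[float],
    betas: Sequence[float],
    kept: Sequence[int],
) -> float:
    """Sum of allele counts times marginal effect estimates over kept markers."""
    return sum(genotypes[i] * betas[i] for i in kept)


# Toy data: markers 0 and 1 are in strong LD, marker 2 is not significant.
pvalues = [1e-8, 1e-6, 0.2, 0.01]
ld_r2 = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
betas = [0.1, 0.2, 0.3, -0.05]
genotypes = [2, 1, 0, 1]  # allele counts for one individual

kept = prune_and_threshold(pvalues, ld_r2)
score = risk_score(genotypes, betas, kept)
```

In this toy example marker 1 is dropped because it is correlated with the more significant marker 0, and marker 2 is dropped by the P value threshold. Both carry some signal that the score never uses, which is the loss of information the abstract refers to.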
Genome-wide association studies (GWAS) and large-scale replication studies have identified common variants in 79 loci associated with breast cancer, explaining ~14% of the familial risk of the disease. To identify new susceptibility loci, we performed a meta-analysis of 11 GWAS comprising 15,748 breast cancer cases and 18,084 controls, and 46,785 cases and 42,892 controls from 41 studies genotyped on a 200K custom array (iCOGS). Analyses were restricted to women of European ancestry. Genotypes for more than 11M SNPs were generated by imputation using the 1000 Genomes Project reference panel. We identified 15 novel loci associated with breast cancer at P < 5×10⁻⁸. Combining association analysis with ChIP-Seq data in mammary cell lines and ChIA-PET chromatin interaction data in ENCODE, we identified likely target genes in two regions: SETBP1 on 18q12.3 and RNF115 and PDZK1 on 1q21.1. One association appears to be driven by an amino-acid substitution in EXO1.
Alterations in DNA methylation frequently occur in hepatocellular cancer (HCC). We have previously demonstrated that hypermethylation in candidate genes can be detected in plasma DNA prior to HCC diagnosis. To identify, with a genome-wide approach, additional genes hypermethylated in HCC that could be used for more accurate analysis of plasma DNA for early diagnosis, we analyzed tumor and adjacent non-tumor tissues from 62 Taiwanese HCC cases using Illumina methylation arrays that screen 26,486 autosomal CpG sites. After Bonferroni adjustment, a total of 2,324 CpG sites significantly differed in methylation level, with 684 CpG sites significantly hypermethylated and 1,640 hypomethylated in tumor compared to non-tumor tissues. Array data were validated with pyrosequencing in a subset of 5 of these genes; correlation coefficients ranged from 0.92 to 0.97. Analysis of plasma DNA from 38 cases demonstrated that 37% to 63% of cases had detectable hypermethylated DNA (≥5% methylation) for these 5 genes individually. At least one of these genes was hypermethylated in 87% of cases, suggesting that measurement of DNA methylation in plasma samples is feasible. The panel of methylated genes identified in the current study will be further tested in a large cohort of prospectively collected samples to determine their utility as early biomarkers of hepatocellular carcinoma.
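For readers unfamiliar with the Bonferroni adjustment mentioned above, the following is a minimal sketch of how a per-site significance cutoff could be derived for 26,486 tests. The family-wise error rate of 0.05 is an assumption (the abstract does not state it), and the function name and example P values are hypothetical.

```python
from typing import List, Optional, Sequence


def bonferroni_significant(
    pvalues: Sequence[float],
    alpha: float = 0.05,            # assumed family-wise error rate
    n_tests: Optional[int] = None,  # defaults to the number of P values given
) -> List[bool]:
    """Flag P values that remain significant after Bonferroni adjustment.

    Each test is compared against alpha divided by the number of tests.
    """
    m = len(pvalues) if n_tests is None else n_tests
    threshold = alpha / m
    return [p <= threshold for p in pvalues]


# Hypothetical per-site P values, tested against the array's 26,486 CpG sites.
N_CPG_SITES = 26_486
flags = bonferroni_significant([1e-7, 1e-5, 0.03], n_tests=N_CPG_SITES)
per_site_cutoff = 0.05 / N_CPG_SITES  # roughly 1.9e-6 under the assumed alpha
```

Under these assumptions only the first hypothetical site passes, since a raw P value must fall below about 1.9×10⁻⁶ to count as significant across the whole array.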
Arsenic contamination of drinking water is a major public health issue in many countries, increasing risk for a wide array of diseases, including cancer. There is inter-individual variation in arsenic metabolism efficiency and susceptibility to arsenic toxicity; however, the basis of this variation is not well understood. Here, we have performed the first genome-wide association study (GWAS) of arsenic-related metabolism and toxicity phenotypes to improve our understanding of the mechanisms by which arsenic affects health. Using data on urinary arsenic metabolite concentrations and approximately 300,000 genome-wide single nucleotide polymorphisms (SNPs) for 1,313 arsenic-exposed Bangladeshi individuals, we identified genome-wide significant association signals (P < 5×10⁻⁸) for percentages of both monomethylarsonic acid (MMA) and dimethylarsinic acid (DMA) near the AS3MT gene (arsenite methyltransferase; 10q24.32), with five genetic variants showing independent associations. In a follow-up analysis of 1,085 individuals with arsenic-induced premalignant skin lesions (the classical sign of arsenic toxicity) and 1,794 controls, we show that one of these five variants (rs9527) is also associated with skin lesion risk (P = 0.0005). Using a subset of individuals with prospectively measured arsenic (n = 769), we show that rs9527 interacts with arsenic to influence incident skin lesion risk (P = 0.01). Expression quantitative trait locus (eQTL) analyses of genome-wide expression data from the lymphocyte RNA of 950 individuals suggest that several of our lead SNPs represent cis-eQTLs for AS3MT (P = 10⁻¹²) and neighboring gene C10orf32 (P = 10⁻⁴⁴), which are involved in C10orf32-AS3MT read-through transcription. This is the largest and most comprehensive genomic investigation of arsenic metabolism and toxicity to date, the only GWAS of any arsenic-related trait, and the first study to implicate 10q24.32 variants in both arsenic metabolism and arsenical skin lesion risk.
The observed patterns of associations suggest that MMA% and DMA% have distinct genetic determinants and support the hypothesis that DMA is the less toxic of these two methylated arsenic species. These results have potential translational implications for the prevention and treatment of arsenic-associated toxicities worldwide.