Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
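As a rough illustration of the contrast drawn above, the following Python sketch compares the two weighting schemes in the simplest possible setting: unlinked markers and a Gaussian (infinitesimal) prior on effect sizes. Under those assumptions, which the abstract does not state, the posterior mean reduces to a uniform shrinkage of each marginal estimate by h2 / (h2 + M/N), where M is the number of markers, N the training sample size, and h2 the heritability explained by the markers. The function names, the `z_cut` parameter, and the standardisation of genotypes and phenotype are hypothetical choices made for this example. LDpred itself also uses LD information from a reference panel, which this sketch deliberately omits.

```python
import math


def shrink_unlinked(beta_hat, n, h2):
    """Posterior mean effects under a Gaussian prior, ASSUMING no LD between markers.

    beta_hat: marginal effect estimates (standardised genotypes and phenotype)
    n:        training sample size
    h2:       heritability explained by the markers (assumed known here)
    """
    m = len(beta_hat)
    factor = h2 / (h2 + m / n)  # uniform shrinkage factor, between 0 and 1
    return [factor * b for b in beta_hat]


def threshold_weights(beta_hat, n, z_cut):
    """Thresholding step of pruning + thresholding: keep an estimate only if
    its absolute z-score reaches z_cut. Markers are assumed already pruned."""
    se = 1 / math.sqrt(n)  # approximate standard error on the standardised scale
    return [b if abs(b) / se >= z_cut else 0.0 for b in beta_hat]


def risk_score(genotypes, weights):
    """Polygenic score for one individual: weighted sum of standardised genotypes."""
    return sum(g * w for g, w in zip(genotypes, weights))


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    beta_hat = [0.3, 0.05, -0.25]
    person = [1.2, -0.4, 0.7]
    print(risk_score(person, shrink_unlinked(beta_hat, n=100, h2=0.5)))
    print(risk_score(person, threshold_weights(beta_hat, n=100, z_cut=2.0)))
```

Thresholding zeroes out weak markers entirely, whereas the shrinkage estimator keeps every marker with a reduced weight. As N grows relative to M the shrinkage factor approaches 1, which is one way to see why less information needs to be discarded at large sample sizes.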
Background The CpG island methylator phenotype (CIMP) is a distinct phenotype associated with microsatellite instability (MSI) and BRAF mutation in colon cancer. Recent investigations have selected 5 promoters (CACNA1G, IGF2, NEUROG1, RUNX3 and SOCS1) as surrogate markers for CIMP-high. However, no study has comprehensively evaluated an expanded set of methylation markers (including these 5 markers) using a large number of tumors, or deciphered the complex clinical and molecular associations with CIMP-high determined by the validated marker panel. Methodology/Principal Findings DNA methylation at 16 CpG islands [the above 5 plus CDKN2A (p16), CHFR, CRABP1, HIC1, IGFBP3, MGMT, MINT1, MINT31, MLH1, p14 (CDKN2A/ARF) and WRN] was quantified in 904 colorectal cancers by real-time PCR (MethyLight). In unsupervised hierarchical clustering analysis, the 5 markers (CACNA1G, IGF2, NEUROG1, RUNX3 and SOCS1), CDKN2A, CRABP1, MINT31, MLH1, p14 and WRN generally clustered with each other and with MSI and BRAF mutation. KRAS mutation did not cluster with any methylation marker, suggesting its association with a random methylation pattern in CIMP-low tumors. Utilizing the validated CIMP marker panel (including the 5 markers), multivariate logistic regression demonstrated that CIMP-high was independently associated with older age, proximal location, poor differentiation, MSI-high and BRAF mutation, and inversely associated with LINE-1 hypomethylation and β-catenin (CTNNB1) activation. Mucinous feature, signet ring cells, and p53-negativity were associated with CIMP-high only in univariate analysis. In stratified analyses, the relations of CIMP-high with poor differentiation, KRAS mutation and LINE-1 hypomethylation differed significantly according to MSI status. Conclusions Our study provides valuable data for standardization of the use of CIMP-high-specific methylation markers. CIMP-high is independently associated with clinical and key molecular features in colorectal cancer. Our data also suggest that KRAS mutation is related to a random CpG island methylation pattern, which may lead to CIMP-low tumors.
Background At least four major categories of invasive breast cancer have been reproducibly identified by gene expression profiling: luminal A, luminal B, HER2-type and basal-like. These subtypes have been shown to differ in their outcome and response to treatment. Whether this heterogeneity reflects the evolution of these subtypes through distinct etiologic pathways has not been clearly defined. Methods We evaluated the association between traditional breast cancer risk factors and risk of previously defined molecular subtypes of breast cancer in the Nurses’ Health Study. This analysis included 2,022 invasive breast cancer cases for whom we were able to obtain archived breast cancer tissue specimens. Tissue microarrays (TMAs) were constructed and slides were immunostained for estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), cytokeratin 5/6 (CK5/6), and epidermal growth factor receptor (EGFR). Using immunostain results in combination with histologic grade, cases were grouped into molecularly defined subtypes. We used Cox proportional hazards models to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). Results We observed differences in the association between risk factors and subtypes of breast cancer. In general, many reproductive factors were most strongly associated with the luminal A subtype, although these differences were not statistically significant. Weight gain since age 18 showed significant differences in its association with molecular subtypes (p-heterogeneity=0.05) and was most strongly associated with the luminal B subtype (p-trend 0.001). Although there was not significant heterogeneity for lactation across subtypes, an inverse association was strongest for basal-like tumors (HR=0.6, 95% CI 0.4–0.8; p-heterogeneity=0.88). Conclusions These results support the hypothesis that different subtypes of breast cancer have different etiologies and should not be considered as a single group. Identifying risk factors for less common subtypes such as luminal B, HER2-type and basal-like tumors has important implications for prevention of these more aggressive subtypes.
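For readers less familiar with how the hazard ratios and confidence intervals quoted above relate to a fitted Cox model, the short Python sketch below shows the standard conversion: the HR is the exponential of the model coefficient, and a Wald-type 95% CI is formed on the log scale. The function names and the choice of z = 1.96 are assumptions made for this example; the abstract does not say how the intervals were computed, so this is an illustration of the usual convention rather than a description of the authors' analysis.

```python
import math


def hazard_ratio_ci(log_hr, se, z=1.96):
    """Return (HR, lower, upper) from a Cox coefficient and its standard error.

    The interval is symmetric on the log scale (Wald-type), which is an
    assumption of this sketch.
    """
    return (
        math.exp(log_hr),
        math.exp(log_hr - z * se),
        math.exp(log_hr + z * se),
    )


def se_from_ci(lower, upper, z=1.96):
    """Approximate the standard error of log(HR) from a reported CI,
    again assuming a Wald-type interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


if __name__ == "__main__":
    # Back out an approximate standard error from the interval reported for
    # lactation and basal-like tumors (0.4 to 0.8). Rounding in the published
    # figures makes this a rough value only.
    se = se_from_ci(0.4, 0.8)
    print(hazard_ratio_ci(math.log(0.6), se))
```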