For many high-dimensional studies, additional information on the variables, like (genomic) annotation or external p-values, is available. In the context of binary and continuous prediction, we develop a method for adaptive group-regularized (logistic) ridge regression, which makes structural use of such 'co-data'. Here, 'groups' refer to a partition of the variables according to the co-data. We derive empirical Bayes estimates of group-specific penalties, which possess several nice properties: i) they are analytical; ii) they adapt to the informativeness of the co-data for the data at hand; iii) only one global penalty parameter requires tuning by cross-validation. In addition, the method allows use of multiple types of co-data at little extra computational effort. We show that the group-specific penalties may lead to a larger distinction between 'near-zero' and relatively large regression parameters, which facilitates post-hoc variable selection. The method, termed GRridge, is implemented in an easy-to-use R-package. It is demonstrated on two cancer genomics studies, which both concern the discrimination of precancerous cervical lesions from normal cervix tissues using methylation microarray data. For both examples, GRridge clearly improves on the predictive performance of ordinary logistic ridge regression and the group lasso. In addition, we show that for the second study the relatively good predictive performance is maintained when selecting only 42 variables.
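To make the idea of group-specific penalties concrete, here is a minimal sketch of linear ridge regression in which every variable in group g is penalized by a single global penalty times a group multiplier. It is not the GRridge R package and does not reproduce its empirical Bayes estimation: the function name `group_ridge`, its parameters, and the hand-picked multipliers are all assumptions made for illustration, and only the continuous-outcome (non-logistic) case is shown.

```python
import numpy as np


def group_ridge(X, y, groups, global_penalty, group_multipliers):
    """Closed-form linear ridge with one penalty per group of variables.

    Hypothetical helper for illustration only. Variable j receives the
    penalty global_penalty * group_multipliers[groups[j]].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups, dtype=int)
    multipliers = np.asarray(group_multipliers, dtype=float)

    # Expand the group-level multipliers to one penalty per variable.
    penalties = global_penalty * multipliers[groups]

    # Solve (X'X + diag(penalties)) beta = X'y.
    return np.linalg.solve(X.T @ X + np.diag(penalties), X.T @ y)


# Toy data (made up): an identity design, so that each coefficient is
# simply y_j / (1 + penalty_j) and the effect of the penalties is visible.
X = np.eye(4)
y = np.ones(4)
groups = [0, 0, 1, 1]            # two co-data groups of two variables each
multipliers = [0.5, 2.0]         # assumed values, not estimated from data

beta = group_ridge(X, y, groups, global_penalty=1.0,
                   group_multipliers=multipliers)
print(beta)
```

In this toy setting the lightly penalized group keeps larger coefficients than the heavily penalized one, which is the kind of separation the abstract describes as helpful for post-hoc variable selection. In the method itself the multipliers would be estimated from the data rather than supplied by hand.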
How mixtures of immune cells associate with cancer cell phenotype and affect pathogenesis is still unclear. In 15 breast cancer gene expression datasets, we invariably identify three clusters of patients with gradual levels of immune infiltration. The intermediate immune infiltration cluster (Cluster B) is associated with a worse prognosis independently of known clinicopathological features. Furthermore, immune clusters are associated with response to neoadjuvant chemotherapy. In silico dissection of the immune contexture of the clusters identified Cluster A as immune cold, Cluster C as immune hot while Cluster B has a pro-tumorigenic immune infiltration. Through phenotypical analysis, we find epithelial mesenchymal transition and proliferation associated with the immune clusters and mutually exclusive in breast cancers. Here, we describe immune clusters which improve the prognostic accuracy of immune contexture in breast cancer. Our discovery of a novel independent prognostic factor in breast cancer highlights a correlation between tumor phenotype and immune contexture.
Background: The aim of this study was to investigate the prognostic value of the PAM50 intrinsic subtypes and risk of recurrence (ROR) score in patients with early breast cancer and long-term follow-up. A special focus was placed on hormone receptor-positive/human epidermal growth factor receptor 2-negative (HR+/HER2−) pN0 patients not treated with chemotherapy. Methods: Patients with early breast cancer (n = 653) enrolled in the observational Oslo1 study (1995–1998) were followed for distant recurrence and breast cancer death. Clinicopathological parameters were collected from hospital records. The primary tumors were analyzed using the Prosigna® PAM50 assay to determine the prognostic value of the intrinsic subtypes and ROR score in comparison with pathological characteristics. The primary endpoints were distant disease-free survival (DDFS) and breast cancer-specific survival (BCSS). Results: Of 653 tumors, 52.2% were classified as luminal A, 26.5% as luminal B, 10.6% as HER2-enriched, and 10.7% as basal-like. Among the HR+/HER2− patients (n = 476), 37.8% were categorized as low risk by ROR score, 22.7% as intermediate risk, and 39.5% as high risk. Median follow-up durations for BCSS and DDFS were 16.6 and 7.1 years, respectively. Multivariate analysis showed that intrinsic subtypes (all patients) and ROR risk classification (HR+/HER2− patients) yielded strong prognostic information. Among the HR+/HER2− pN0 patients with no adjuvant treatment (n = 231), 53.7% of patients had a low ROR, and their prognosis at 15 years was excellent (15-year BCSS 96.3%). Patients with intermediate risk had reduced survival compared with those with low risk (p = 0.005). In contrast, no difference in survival between the low- and intermediate-risk groups was seen for HR+/HER2− pN0 patients who received tamoxifen only. Ki-67 protein, grade, and ROR score were analyzed in the unselected, untreated pT1pN0 HR+/HER2− population (n = 171). In multivariate analysis, ROR score outperformed both Ki-67 and grade. Furthermore, 55% of patients who according to the PREDICT tool (http://www.predict.nhs.uk/) would be considered chemotherapy candidates were ROR low risk (33%) or luminal A ROR intermediate risk (22%). Conclusions: The PAM50 intrinsic subtype classification and ROR score improve classification of patients with breast cancer into prognostic groups, allowing for a more precise identification of future recurrence risk and providing an improved basis for adjuvant treatment decisions. Node-negative patients with low ROR scores had an excellent outcome at 15 years even in the absence of adjuvant therapy. Electronic supplementary material: The online version of this article (doi:10.1186/s13058-017-0911-9) contains supplementary material, which is available to authorized users.
Combining genome-wide structural models with phenomenological data is at the forefront of efforts to understand the organizational principles regulating the human genome. Here, we use chromosome-chromosome contact data as knowledge-based constraints for large-scale three-dimensional models of the human diploid genome. The resulting models remain minimally entangled and acquire several functional features that are observed in vivo and that were never used as input for the model. We find, for instance, that gene-rich, active regions are drawn towards the nuclear center, while gene-poor and lamina-associated domains are pushed to the periphery. These and other properties persist upon adding local contact constraints, suggesting their compatibility with non-local constraints for the genome organization. The results show that suitable combinations of data analysis and physical modelling can expose the unexpectedly rich, functionally related properties implicit in chromosome-chromosome contact data. Specific directions are suggested for further developments based on combining experimental data analysis and genomic structural modelling.
Multigene assays for molecular subtypes and biomarkers can aid management of early invasive breast cancer. Using RNA-sequencing we aimed to develop single-sample predictor (SSP) models for clinical markers, subtypes, and risk of recurrence (ROR). A cohort of 7743 patients was divided into a training set and a test set. We trained SSPs for subtypes and ROR assigned by nearest-centroid (NC) methods and SSPs for biomarkers from histopathology. Classifications were compared with Prosigna in two external cohorts (ABiM, n = 100 and OSLO2-EMIT0, n = 103). Prognostic value was assessed using distant recurrence-free interval. Agreement between SSP and NC was high for PAM50 (five subtypes; 85%, Kappa = 0.78), very high for Subtype (four subtypes; 90%, Kappa = 0.84), and high for ROR risk category (84%, Kappa = 0.75, weighted Kappa = 0.90). Prognostic value was assessed as equivalent and clinically relevant. Agreement with histopathology was very high or high for receptor status, while moderate for Ki67 status and poor for Nottingham histological grade. SSP and Prosigna concordance was high for subtype (OSLO2-EMIT0 83%, Kappa = 0.73 and ABiM 80%, Kappa = 0.72) and moderate and high for ROR risk category (68 and 84%, Kappa = 0.50 and 0.70, weighted Kappa = 0.70 and 0.78). Pooled concordance for emulated treatment recommendation dichotomized for chemotherapy was high (85%, Kappa = 0.66). Retrospective evaluation suggested that SSP application could change chemotherapy recommendations for up to 17% of postmenopausal ER+/HER2-/N0 patients with balanced escalation and de-escalation. Results suggest that NC and SSP models are interchangeable on a group level and nearly so on a patient level and that SSP models can be derived to closely match clinical tests.
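The agreement figures above pair a raw concordance percentage with a Kappa statistic. As a rough illustration of how the two relate, the sketch below computes unweighted Cohen's kappa for two label vectors. The function name `cohen_kappa` and the toy subtype calls are assumptions for illustration, not data or code from the study, and the weighted variant reported for ROR risk category is not covered.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences.

    Hypothetical helper for illustration only.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")

    n = len(labels_a)

    # Observed agreement: fraction of samples given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the two marginal label frequencies,
    # summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / n**2

    if expected == 1.0:
        # Degenerate case: both classifiers used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up subtype calls from two classifiers on four samples.
ssp_calls = ["LumA", "LumA", "LumB", "Basal"]
nc_calls = ["LumA", "LumB", "LumB", "Basal"]

concordance = sum(a == b for a, b in zip(ssp_calls, nc_calls)) / len(ssp_calls)
kappa = cohen_kappa(ssp_calls, nc_calls)
print(f"concordance = {concordance:.2f}, kappa = {kappa:.3f}")
```

Kappa comes out lower than raw concordance here because it subtracts the agreement expected by chance from the marginal label frequencies, which is why the abstract reports both numbers side by side.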