Summary We consider the problem of estimating multiple related Gaussian graphical models from a high-dimensional data set with observations belonging to distinct classes. We propose the joint graphical lasso, which borrows strength across the classes in order to estimate multiple graphical models that share certain characteristics, such as the locations or weights of nonzero edges. Our approach is based upon maximizing a penalized log likelihood. We employ generalized fused lasso or group lasso penalties, and implement a fast ADMM algorithm to solve the corresponding convex optimization problems. The performance of the proposed method is illustrated through simulated and real data examples.
BackgroundBreast cancer cell lines have been used widely to investigate breast cancer pathobiology and new therapies. Breast cancer is a molecularly heterogeneous disease, and it is important to understand how well and which cell lines best model that diversity. In particular, microarray studies have identified molecular subtypes–luminal A, luminal B, ERBB2-associated, basal-like and normal-like–with characteristic gene-expression patterns and underlying DNA copy number alterations (CNAs). Here, we studied a collection of breast cancer cell lines to catalog molecular profiles and to assess their relation to breast cancer subtypes.MethodsWhole-genome DNA microarrays were used to profile gene expression and CNAs in a collection of 52 widely-used breast cancer cell lines, and comparisons were made to existing profiles of primary breast tumors. Hierarchical clustering was used to identify gene-expression subtypes, and Gene Set Enrichment Analysis (GSEA) to discover biological features of those subtypes. Genomic and transcriptional profiles were integrated to discover within high-amplitude CNAs candidate cancer genes with coordinately altered gene copy number and expression.FindingsTranscriptional profiling of breast cancer cell lines identified one luminal and two basal-like (A and B) subtypes. Luminal lines displayed an estrogen receptor (ER) signature and resembled luminal-A/B tumors, basal-A lines were associated with ETS-pathway and BRCA1 signatures and resembled basal-like tumors, and basal-B lines displayed mesenchymal and stem/progenitor-cell characteristics. Compared to tumors, cell lines exhibited similar patterns of CNA, but an overall higher complexity of CNA (genetically simple luminal-A tumors were not represented), and only partial conservation of subtype-specific CNAs. We identified 80 high-level DNA amplifications and 13 multi-copy deletions, and the resident genes with concomitantly altered gene-expression, highlighting known and novel candidate breast cancer genes.ConclusionsOverall, breast cancer cell lines were genetically more complex than tumors, but retained expression patterns with relevance to the luminal-basal subtype distinction. The compendium of molecular profiles defines cell lines suitable for investigations of subtype-specific pathobiology, cancer stem cell biology, biomarkers and therapies, and provides a resource for discovery of new breast cancer genes.
In this paper, we propose a computationally efficient approach —space(Sparse PArtial Correlation Estimation)— for selecting non-zero partial correlations under the high-dimension-low-sample-size setting. This method assumes the overall sparsity of the partial correlation matrix and employs sparse regression techniques for model fitting. We illustrate the performance of space by extensive simulation studies. It is shown that space performs well in both non-zero partial correlation selection and the identification of hub variables, and also outperforms two existing methods. We then apply space to a microarray breast cancer data set and identify a set of hub genes which may provide important insights on genetic regulatory networks. Finally, we prove that, under a set of suitable assumptions, the proposed procedure is asymptotically consistent in terms of model selection and parameter estimation.
The genome of an admixed individual represents a mixture of alleles from different ancestries.In the United States, the two largest minority groups, African Americans and Hispanics, are both admixed. An understanding of the admixture proportion at an individual level (individual admixture, or IA) is valuable for both population geneticists and epidemiologists who conduct case-control association studies in these groups. Here we present an extension of a previously described frequentist (maximum likelihood or ML) approach to estimate individual admixture that allows for uncertainty in ancestral allele frequencies. We compare this approach both to prior partial likelihood based methods as well as more recently described Bayesian MCMC methods.Our full ML method demonstrates increased robustness when compared to an existing partial ML approach. Simulations also suggest that this frequentist estimator achieves similar efficiency, measured by the mean squared error criterion, as Bayesian methods but requires just a tiny fraction of the computational time to produce point estimates, allowing for extensive analysis (e.g. simulations) not possible by Bayesian methods. Our simulation results demonstrate that inclusion of ancestral populations or their surrogates in the analysis is required by any method of IA estimation to obtain reasonable results.keywords: admixture, EM algorithm, maximum likelihood estimate.
Breast cancer is a leading cause of cancer-death among women, where the clinicopathological features of tumors are used to prognosticate and guide therapy. DNA copy number alterations (CNAs), which occur frequently in breast cancer and define key pathogenetic events, are also potentially useful prognostic or predictive factors. Here, we report a genome-wide array-based comparative genomic hybridization (array CGH) survey of CNAs in 89 breast tumors from a patient cohort with locally advanced disease. Statistical analysis links distinct cytoband loci harboring CNAs to specific clinicopathological parameters, including tumor grade, estrogen receptor status, presence of TP53 mutation, and overall survival. Notably, distinct spectra of CNAs also underlie the different subtypes of breast cancer recently defined by expression-profiling, implying these subtypes develop along distinct genetic pathways. In addition, higher numbers of gains/losses are associated with the "basal-like" tumor subtype, while high-level DNA amplification is more frequent in "luminal-B" subtype tumors, suggesting also that distinct mechanisms of genomic instability might underlie their pathogenesis. The identified CNAs may provide a basis for improved patient prognostication, as well as a starting point to define important genes to further our understanding of the pathobiology of breast cancer. This article contains Supplementary Material available at http://www.interscience.wiley.com/jpages/1045-2257/suppmat
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.