Background: Systematic analysis of cancer gene-expression patterns using high-throughput transcriptional profiling technologies has led to the discovery and publication of hundreds of gene-expression signatures. However, few public signature values have been cross-validated over multiple studies for the prediction of cancer prognosis and chemosensitivity in the neoadjuvant setting.
Methods: To analyze the prognostic and predictive values of publicly available signatures, we have implemented a systematic method for high-throughput and efficient validation of a large number of datasets and gene-expression signatures. Using this method, we performed a meta-analysis including 351 publicly available signatures, 37,000 random signatures, and 31 breast cancer datasets. Survival analyses and pathologic responses were used to assess prediction of prognosis, chemoresponsiveness, and chemo-drug sensitivity.
Results: Among 31 breast cancer datasets and 351 public signatures, we identified 22 validation datasets, two robust prognostic signatures (BRmet50 and PMID18271932Sig33) in breast cancer and one signature (PMID20813035Sig137) specific for prognosis prediction in patients with ER-negative tumors. The 22 validation datasets demonstrated enhanced ability to distinguish cancer gene profiles from random gene profiles. Both prognostic signatures are composed of genes associated with TP53 mutations and were able to stratify the good and poor prognostic groups successfully in 82% and 68% of the 22 validation datasets, respectively. We then assessed the abilities of the two signatures to predict treatment responses of breast cancer patients treated with commonly used chemotherapeutic regimens. Both BRmet50 and PMID18271932Sig33 retrospectively identified those patients with an insensitive response to neoadjuvant chemotherapy (mean positive predictive values 85%-88%). Among those patients predicted to be treatment sensitive, distant relapse-free survival (DRFS) was improved (negative predictive values 87%-88%). BRmet50 was further shown to prospectively predict taxane-anthracycline sensitivity in patients with HER2-negative (HER2-) breast cancer.
Conclusions: We have developed and applied a high-throughput screening method for public cancer signature validation. Using this method, we identified appropriate datasets for cross-validation and two robust signatures that differentiate TP53 mutation status and have prognostic and predictive value for breast cancer patients.
Electronic supplementary material: The online version of this article (doi:10.1186/s12885-015-1102-7) contains supplementary material, which is available to authorized users.
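The comparison against random signatures mentioned in this abstract can be illustrated with an empirical p-value: the fraction of random signatures whose score is at least as strong as the candidate signature's score. The following is a minimal sketch, not the authors' pipeline. The function name `empirical_p_value`, the idea of a single numeric score per signature (for example, a log-rank statistic), and the add-one correction are all assumptions made for illustration, and the numbers are invented.

```python
from typing import Sequence


def empirical_p_value(observed: float, random_scores: Sequence[float]) -> float:
    """Share of random-signature scores that are >= the observed score.

    Uses the add-one correction so the estimate is never exactly zero.
    Hypothetical helper; the abstract does not specify how the comparison
    with random signatures was scored.
    """
    if not random_scores:
        raise ValueError("random_scores must not be empty")
    at_least_as_strong = sum(1 for score in random_scores if score >= observed)
    return (at_least_as_strong + 1) / (len(random_scores) + 1)


if __name__ == "__main__":
    # Invented scores standing in for random-signature results.
    scores = [0.5, 1.2, 2.8, 0.9, 3.5, 1.1, 0.3, 2.0, 1.7]
    print(empirical_p_value(3.0, scores))
```

With many random signatures per dataset, a small value suggests that the candidate signature separates prognostic groups better than chance gene sets do in that dataset.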
Background: Robust transcriptional signatures in cancer can be identified by data similarity-driven meta-analysis of gene expression profiles. An unbiased data integration and interrogation strategy has not previously been available.
Methods and Findings: We implemented and performed a large meta-analysis of breast cancer gene expression profiles from 223 datasets containing 10,581 human breast cancer samples using a novel data similarity-based approach (iterative EXALT). Cancer gene expression signatures extracted from individual datasets were clustered by data similarity and consolidated into a meta-signature with a recurrent and concordant gene expression pattern. A retrospective survival analysis was performed to evaluate the predictive power of a novel meta-signature deduced from transcriptional profiling studies of human breast cancer. Validation cohorts consisting of 6,011 breast cancer patients from 21 different breast cancer datasets and 1,110 patients with other malignancies (lung and prostate cancer) were used to test the robustness of our findings. During the iterative EXALT analysis, 633 signatures were grouped by their data similarity and formed 121 signature clusters. From the 121 signature clusters, we identified a unique meta-signature (BRmet50) based on a cluster of 11 signatures sharing a phenotype related to highly aggressive breast cancer. In patients with breast cancer, there was a significant association between BRmet50 and disease outcome, and the prognostic power of BRmet50 was independent of common clinical and pathologic covariates. Furthermore, the prognostic value of BRmet50 was not specific to breast cancer, as it also predicted survival in prostate and lung cancers.
Conclusions: We have established and implemented a novel data similarity-driven meta-analysis strategy. Using this approach, we identified a transcriptional meta-signature (BRmet50) in breast cancer, and the prognostic performance of BRmet50 was robust and applicable across a wide range of cancer-patient populations.
Background: Community-associated methicillin-resistant Staphylococcus aureus (CA-MRSA) is one of the most common causes of skin and soft tissue infections in the United States, and a variety of genetic host factors are suspected to be risk factors for recurrent infection. Based on the CDC definition, we have developed and validated an electronic health record (EHR)-based CA-MRSA phenotype algorithm utilizing both structured and unstructured data.
Methods: The algorithm was validated at three eMERGE consortium sites, and positive predictive value (PPV), negative predictive value, and sensitivity were calculated. The algorithm was then run and data were collected across seven sites in total. The resulting data were used in a GWAS analysis.
Results: Across seven sites, the CA-MRSA phenotype algorithm identified a total of 349 cases and 7761 controls among the genotyped European and African American biobank populations. PPV ranged from 68 to 100% for cases and 96 to 100% for controls; sensitivity ranged from 94 to 100% for cases and 75 to 100% for controls. Frequency of cases in the populations varied widely by site. There were no plausible GWAS-significant (p < 5 × 10⁻⁸) findings.
Conclusions: Differences in EHR data representation and screening patterns across sites may have affected identification of cases and controls and accounted for varying frequencies across sites. Future work identifying these patterns is necessary.
Electronic supplementary material: The online version of this article (doi:10.1186/s12879-016-2020-2) contains supplementary material, which is available to authorized users.
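The validation metrics named in this abstract (positive predictive value, negative predictive value, and sensitivity) follow directly from a two-by-two table of algorithm calls against chart review. The sketch below shows the standard definitions only. The class and function names are hypothetical, and the counts in the example are invented, not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing algorithm calls with a reference standard."""
    true_pos: int
    false_pos: int
    true_neg: int
    false_neg: int


def _ratio(numerator: int, denominator: int) -> float:
    # Refuse to report a metric when no records fall in the denominator.
    if denominator == 0:
        raise ValueError("metric undefined: denominator is zero")
    return numerator / denominator


def ppv(c: ConfusionCounts) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    return _ratio(c.true_pos, c.true_pos + c.false_pos)


def npv(c: ConfusionCounts) -> float:
    """Negative predictive value: TN / (TN + FN)."""
    return _ratio(c.true_neg, c.true_neg + c.false_neg)


def sensitivity(c: ConfusionCounts) -> float:
    """Sensitivity (recall): TP / (TP + FN)."""
    return _ratio(c.true_pos, c.true_pos + c.false_neg)


if __name__ == "__main__":
    # Invented chart-review counts for a single hypothetical site.
    site = ConfusionCounts(true_pos=47, false_pos=3, true_neg=48, false_neg=2)
    print(f"PPV={ppv(site):.2f} NPV={npv(site):.2f} "
          f"sensitivity={sensitivity(site):.2f}")
```

Computing these per site, as the abstract reports ranges across sites, makes it easier to see where differences in EHR data representation affect the algorithm.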