The selection of effective genes that accurately predict chemotherapy responses might improve cancer outcomes. We compare optimized gene signatures for cisplatin, carboplatin, and oxaliplatin responses in the same cell lines and validate each signature using data from patients with cancer. Supervised support vector machine learning is used to derive gene sets whose expression is related to the cell line GI50 values by backwards feature selection with cross-validation. Specific genes and functional pathways distinguishing sensitive from resistant cell lines are identified by contrasting signatures obtained at extreme and median GI50 thresholds. Ensembles of gene signatures at different thresholds are combined to reduce the dependence on specific GI50 values for predicting drug responses. The most accurate gene signatures for each platin are: cisplatin: BARD1, BCL2, BCL2L1, CDKN2C, FAAP24, FEN1, MAP3K1, MAPK13, MAPK3, NFKB1, NFKB2, SLC22A5, SLC31A2, TLR4, and TWIST1; carboplatin: AKT1, EIF3K, ERCC1, GNGT1, GSR, MTHFR, NEDD4L, NLRP1, NRAS, RAF1, SGK1, TIGD1, TP53, VEGFB, and VEGFC; and oxaliplatin: BRAF, FCGR2A, IGF1, MSH2, NAGK, NFE2L2, NQO1, PANK3, SLC47A1, SLCO1B1, and UGT1A1. Data from The Cancer Genome Atlas (TCGA) patients with bladder, ovarian, and colorectal cancer were used to test the cisplatin, carboplatin, and oxaliplatin signatures, resulting in 71.0%, 60.2%, and 54.5% accuracies in predicting disease recurrence and 59%, 61%, and 72% accuracies in predicting remission, respectively. One cisplatin signature predicted 100% of recurrence in non-smoking patients with bladder cancer (57% disease-free; N = 19), and 79% recurrence in smokers (62% disease-free; N = 35). This approach should be adaptable to other studies of chemotherapy responses, regardless of the drug or cancer types.
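The derivation step described above (thresholding GI50 values into sensitive and resistant classes, then pruning genes by backward feature selection under cross-validation) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' pipeline: a nearest-centroid classifier stands in for the support vector machine, the expression matrix and GI50 values are synthetic, and the function names (`label_by_gi50`, `centroid_predict`, `cv_accuracy`, `backward_select`) are hypothetical.

```python
from statistics import mean

def label_by_gi50(gi50_values, threshold):
    """Label a cell line resistant (1) if its GI50 exceeds the threshold, else sensitive (0)."""
    return [1 if g > threshold else 0 for g in gi50_values]

def centroid_predict(train_X, train_y, x, features):
    """Nearest-centroid prediction using only the listed gene indices.
    Assumption: this simple classifier stands in for the SVM named in the text."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        dist = sum((x[f] - mean(r[f] for r in rows)) ** 2 for f in features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def cv_accuracy(X, y, features, k=5):
    """k-fold cross-validated accuracy; sample i is assigned to fold i % k."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(
            centroid_predict(train_X, train_y, X[i], features) == y[i] for i in test
        )
    return correct / len(X)

def backward_select(X, y, n_keep, k=5):
    """Repeatedly drop the gene whose removal leaves the best CV accuracy."""
    features = list(range(len(X[0])))
    while len(features) > n_keep:
        scores = {
            f: cv_accuracy(X, y, [g for g in features if g != f], k) for f in features
        }
        best = max(scores.values())
        # Arbitrary tie-break for this sketch: drop the highest-indexed gene among ties.
        features.remove(max(f for f, s in scores.items() if s == best))
    return features

# Synthetic data: 20 cell lines, 5 genes.
# Gene 0 tracks GI50; genes 1-4 are uninformative 0/1 patterns.
gi50 = [1.0 + 0.1 * i for i in range(10)] + [9.0 + 0.1 * i for i in range(10)]
X = [[g, i % 2, (i // 2) % 2, (i // 3) % 2, (i // 4) % 2] for i, g in enumerate(gi50)]
y = label_by_gi50(gi50, threshold=5.0)

selected = backward_select(X, y, n_keep=1)
print(selected, cv_accuracy(X, y, selected))
```

On this toy data the procedure keeps only gene 0, the one constructed to follow GI50. Changing `threshold` re-labels the cell lines, which is how signatures at extreme and median GI50 cut-offs could be contrasted in the same framework.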
2018, 7:233 Last updated: 20 MAR 2019
Background: Gene signatures derived from transcriptomic data using machine learning methods have shown promise for biodosimetry testing. These signatures may not be sufficiently robust for large-scale testing, as their performance has not been adequately validated on external, independent datasets. The present study develops human and murine signatures with biochemically inspired machine learning that are strictly validated using k-fold and traditional approaches.
Methods: Gene Expression Omnibus (GEO) datasets of exposed human and murine lymphocytes were preprocessed via nearest neighbor imputation, and the expression levels of genes implicated in the literature as responsive to radiation exposure (n=998) were then ranked by Minimum Redundancy Maximum Relevance (mRMR). Optimal signatures were derived by backward, complete, and forward sequential feature selection using Support Vector Machines (SVM), and validated using k-fold or traditional validation on independent datasets.
Results: The best human signatures we derived exhibit k-fold validation accuracies of up to 98% (DDB2, PRKDC, TPP2, PTPRE, and GADD45A) when validated over 209 samples and traditional validation accuracies of up to 92% (DDB2, CD8A, TALDO1, PCNA, EIF4G2, LCN2, CDKN1A, PRKCH, ENO1, and PPM1D) when validated over 85 samples. Some human signatures are specific enough to differentiate between chemotherapy and radiotherapy. Certain multi-class murine signatures have sufficient granularity in dose estimation to inform eligibility for cytokine therapy (assuming these signatures could be translated to humans). We compiled a list of the most frequently appearing genes in the top 20 human and mouse signatures. Genes that appear more frequently across an ensemble of signatures may have a greater impact on the performance of individual signatures. Several genes in the signatures we derived are present in previously proposed signatures.
Conclusions: Gene signatures for ionizing radiation exposure derived by machine learning have low error rates in externally validated, independent datasets, and exhibit high specificity and granularity for dose estimation.
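The mRMR ranking step named in the Methods can be illustrated with a small greedy sketch. Several assumptions apply: absolute Pearson correlation stands in for the mutual-information terms usually used in mRMR, the score is the difference form (relevance minus mean redundancy), the three genes and their binarized expression values (1 = high, 0 = low) are synthetic, and the names `pearson`, `mrmr_rank`, `gene_A`, `gene_B`, and `gene_C` are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def mrmr_rank(expression, labels):
    """Greedy mRMR: at each step pick the gene that maximizes
    relevance to the labels minus mean redundancy with genes already ranked."""
    relevance = {g: abs(pearson(v, labels)) for g, v in expression.items()}
    ranked, remaining = [], list(expression)
    while remaining:
        def score(gene):
            if not ranked:
                return relevance[gene]
            redundancy = sum(
                abs(pearson(expression[gene], expression[s])) for s in ranked
            ) / len(ranked)
            return relevance[gene] - redundancy
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked

# Synthetic example: 8 unexposed (0) and 8 exposed (1) samples.
labels = [0] * 8 + [1] * 8
expression = {
    # gene_A is the most relevant gene.
    "gene_A": [0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1],
    # gene_B is relevant but largely redundant with gene_A.
    "gene_B": [0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1],
    # gene_C is less relevant but uncorrelated with gene_A.
    "gene_C": [1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1],
}

ranking = mrmr_rank(expression, labels)
by_relevance = sorted(expression, key=lambda g: -abs(pearson(expression[g], labels)))
print(ranking)       # mRMR order
print(by_relevance)  # order by relevance alone
```

In this toy case mRMR promotes `gene_C` above `gene_B`, because `gene_B` adds little beyond `gene_A`, whereas ranking by relevance alone would place `gene_B` second. A ranking of this kind would then feed the sequential feature selection described in the Methods.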
Selection of effective genes that accurately predict chemotherapy response could improve cancer outcomes. We compare optimized gene signatures for cisplatin, carboplatin, and oxaliplatin response in the same cell lines, and validate each signature using data from patients with cancer. Supervised support vector machine learning was used to derive gene sets whose expression was related to cell line GI50 values by backwards feature selection with cross-validation. Specific genes and functional pathways distinguishing sensitive from resistant cell lines were identified by contrasting signatures obtained at extreme vs. median GI50 thresholds. Ensembles of gene signatures at different thresholds were combined to reduce dependence on specific GI50 values for predicting drug response. The most accurate models for each platin are: cisplatin: BARD1, BCL2, BCL2L1, CDKN2C, FAAP24, FEN1, MAP3K1, MAPK13, MAPK3, NFKB1, NFKB2, SLC22A5, SLC31A2, TLR4, TWIST1; carboplatin: AKT1, EIF3K, ERCC1, GNGT1, GSR, MTHFR, NEDD4L, NLRP1, NRAS, RAF1, SGK1, TIGD1, TP53, VEGFB, VEGFC; oxaliplatin: BRAF, FCGR2A, IGF1, MSH2, NAGK, NFE2L2, NQO1, PANK3, SLC47A1, SLCO1B1, UGT1A1. Data from TCGA patients with bladder, ovarian, and colorectal cancer were used to test the cisplatin, carboplatin, and oxaliplatin signatures, respectively, resulting in 71.0%, 60.2%, and 54.5% accuracy in predicting disease recurrence and 59%, 61%, and 72% accuracy in predicting remission. One cisplatin signature predicted 100% of recurrence in non-smoking bladder cancer patients (57% disease-free; N=19), and 79% recurrence in smokers (62% disease-free; N=35). This approach should be adaptable to other studies of chemotherapy response, independent of drug or cancer types.
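The idea of combining signatures derived at different GI50 thresholds can be sketched as a majority vote. This is an illustration under stated assumptions, not the published method: nearest-centroid models stand in for the SVMs, the voting rule is a plain majority, the eight cell lines and two genes are synthetic, and `fit_signature` and `ensemble_predict` are hypothetical names.

```python
from statistics import mean

def fit_signature(X, gi50, threshold, genes):
    """Fit one signature: cell lines with GI50 above the threshold are resistant (1).
    Returns a predictor that uses only the listed gene indices.
    Assumption: nearest-centroid stands in for the SVM; the threshold must
    leave at least one cell line in each class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, g in zip(X, gi50) if (g > threshold) == bool(label)]
        centroids[label] = [mean(r[j] for r in rows) for j in genes]

    def predict(x):
        dists = {
            lab: sum((x[j] - c) ** 2 for j, c in zip(genes, cen))
            for lab, cen in centroids.items()
        }
        return min(dists, key=dists.get)

    return predict

def ensemble_predict(models, x):
    """Majority vote across signatures; returns (call, individual votes)."""
    votes = [m(x) for m in models]
    return (1 if 2 * sum(votes) > len(votes) else 0), votes

# Synthetic data: 8 cell lines, 2 genes whose expression rises with GI50.
gi50 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
X = [[g, 2 * g] for g in gi50]

# One (threshold, gene set) pair per signature: low, median, and high cut-offs.
signatures = [(2.5, [0]), (4.5, [0, 1]), (6.5, [1])]
models = [fit_signature(X, gi50, t, genes) for t, genes in signatures]

call_a, votes_a = ensemble_predict(models, [4.0, 8.0])
call_b, votes_b = ensemble_predict(models, [5.0, 10.0])
print(call_a, votes_a)
print(call_b, votes_b)
```

The first sample is called resistant only by the low-threshold signature and is outvoted, while the second is called resistant by two of the three, so the ensemble call depends less on any single GI50 cut-off.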