Background: Small cell lung cancer (SCLC) is an aggressive neuroendocrine lung cancer. SCLC progression and treatment resistance involve epigenetic processes. However, links between SCLC DNA methylation and drug response remain unclear. We performed an epigenome-wide study of 66 human SCLC cell lines using the Illumina Infinium MethylationEPIC BeadChip array. Correlations of SCLC DNA methylation and gene expression with in vitro response to 526 antitumor agents were examined. Results: We found multiple significant correlations between DNA methylation and chemosensitivity. A potentially important association was observed for TREX1, which encodes the 3′ exonuclease I that serves as a STING antagonist in the regulation of a cytosolic DNA-sensing pathway. Increased methylation and low expression of TREX1 were associated with sensitivity to the Aurora kinase inhibitors AZD-1152, SCH-1473759, SNS-314, and TAK-901; the CDK inhibitor R-547; the Vertex ATR inhibitor Cpd 45; and the mitotic spindle disruptor vinorelbine. Compared with cell lines of other cancer types, TREX1 had low mRNA expression and increased upstream region methylation in SCLC, suggesting a possible relationship with SCLC sensitivity to Aurora kinase inhibitors. We also identified multiple additional correlations indicative of potential mechanisms of chemosensitivity. Methylation of the 3′ UTR of CEP350 and MLPH, involved in centrosome machinery and microtubule tracking, respectively, was associated with response to Aurora kinase inhibitors and other agents. EPAS1 methylation was associated with response to Aurora kinase inhibitors, a PLK-1 inhibitor and a Bcl-2 inhibitor. KDM1A methylation was associated with response to PLK-1 inhibitors and a KSP inhibitor. Increased promoter methylation of SLFN11 was correlated with resistance to DNA damaging agents, as a result of low or no SLFN11 expression. Methylation of the 5′ UTR of the epigenetic modifier EZH2 was associated with response to Aurora kinase inhibitors and an FGFR inhibitor. 
Methylation and expression of YAP1 were correlated with response to an mTOR inhibitor. Among nonneuroendocrine markers, EPHA2 was associated with response to Aurora kinase inhibitors and a PLK-1 inhibitor and CD151 with Bcl-2 inhibitors.
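The kind of methylation-versus-response correlation described in this abstract can be illustrated with a minimal sketch. The abstract does not say which statistic was used, so the Spearman rank correlation below is an assumption, and the beta-values and log IC50 values are invented toy numbers rather than data from the study.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: one methylation probe across six cell lines,
# and the log IC50 of one agent in the same cell lines.
methylation_beta = [0.10, 0.25, 0.40, 0.55, 0.70, 0.85]
log_ic50 = [2.1, 1.8, 1.9, 1.2, 0.9, 0.4]

rho = spearman(methylation_beta, log_ic50)
print(f"Spearman rho = {rho:.3f}")
```

In this toy case a negative rho means that higher methylation goes with a lower IC50, that is, greater sensitivity. A real analysis across many probes and agents would also need multiple-testing correction, which this sketch omits.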
Background: The high degree of heterogeneity observed in breast cancers makes it very difficult to classify the cancer patients into distinct clinical subgroups and consequently limits the ability to devise effective therapeutic strategies. Several classification strategies based on ER/PR/HER2 expression or the expression profiles of a panel of genes have helped, but such methods often produce misleading results due to their dynamic nature. In contrast, somatic DNA mutations are relatively stable and lead to initiation and progression of many sporadic cancers. Hence in this study, we explore the use of gene mutation profiles to classify, characterize and predict the subgroups of breast cancers. Results: We analyzed the whole exome sequencing data from 358 ethnically similar breast cancer patients in The Cancer Genome Atlas (TCGA) project. Somatic and non-synonymous single nucleotide variants identified from each patient were assigned a quantitative score (C-score) that represents the extent of negative impact on the gene function. Using these scores with the non-negative matrix factorization method, we clustered the patients into three subgroups. By comparing the clinical stage of patients, we identified an early-stage-enriched and a late-stage-enriched subgroup. Comparison of the mutation scores of the early-stage-enriched and late-stage-enriched subgroups identified 358 genes that carry significantly higher mutation rates in the late-stage-enriched subgroup. Functional characterization of these genes revealed important functional gene families that carry a heavy mutational load in the late-stage-enriched subgroup of patients. Finally, using the identified subgroups, we also developed a supervised classification model to predict the stage of the patients. Conclusions: This study demonstrates that gene mutation profiles can be effectively used with unsupervised machine-learning methods to identify clinically distinguishable breast cancer subgroups. 
The classification model developed in this study could provide a reasonable prediction of the cancer patients’ stage solely based on their mutation profiles. This study represents the first use of only somatic mutation profile data to identify and predict breast cancer subgroups and this generic methodology can also be applied to other cancer datasets. Electronic supplementary material: The online version of this article (doi:10.1186/s12918-016-0306-z) contains supplementary material, which is available to authorized users.
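The clustering step in this abstract, factorizing a patient-by-gene score matrix with non-negative matrix factorization (NMF) and reading subgroups from the factors, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it uses the standard multiplicative-update rules for the squared-error objective, the score matrix is invented, and the choice of two components for the toy data (the study reports three subgroups) is an assumption made to keep the example small.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factorize non-negative V (n x m) as W (n x k) times H (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


# Hypothetical per-gene mutation scores: rows are patients, columns are genes.
c_scores = [
    [9.0, 8.0, 0.0, 0.0],
    [8.0, 9.0, 0.0, 1.0],
    [9.0, 7.0, 1.0, 0.0],
    [0.0, 1.0, 8.0, 9.0],
    [1.0, 0.0, 9.0, 8.0],
    [0.0, 0.0, 7.0, 9.0],
]

w, h = nmf(c_scores, k=2)
# Assign each patient to the component with the largest weight in W.
subgroups = [row.index(max(row)) for row in w]
print(subgroups)
```

Assigning each patient to the dominant column of W is one common convention for turning NMF factors into hard clusters; choosing the number of components and checking cluster stability over several random seeds would be needed on real data.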
Background: Understanding protein subcellular localization is a necessary component toward understanding the overall function of a protein. Numerous computational methods have been published over the past decade, with varying degrees of success. Despite the large number of published methods in this area, only a small fraction of them are available for researchers to use in their own studies. Of those that are available, many are limited by predicting only a small number of organelles in the cell. Additionally, the majority of methods predict only a single location for a sequence, even though it is known that a large fraction of the proteins in eukaryotic species shuttle between locations to carry out their function. Findings: We present a software package and a web server for predicting the subcellular localization of protein sequences based on the ngLOC method. ngLOC is an n-gram-based Bayesian classifier that predicts subcellular localization of proteins both in prokaryotes and eukaryotes. The overall prediction accuracy varies from 89.8% to 91.4% across species. This program can predict 11 distinct locations each in plant and animal species. ngLOC also predicts 4 and 5 distinct locations on gram-positive and gram-negative bacterial datasets, respectively. Conclusions: ngLOC is a generic method that can be trained by data from a variety of species or classes for predicting protein subcellular localization. The standalone software is freely available for academic use under GNU GPL, and the ngLOC web server is also accessible at http://ngloc.unmc.edu.
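The general idea of an n-gram-based Bayesian classifier for sequences can be sketched in a few lines. The code below is a generic multinomial naive Bayes over overlapping n-grams with Laplace smoothing; it is not the ngLOC implementation, and the class name `NGramNaiveBayes`, the smoothing constant, the n-gram length, and the toy sequences and labels are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log


def ngrams(seq, n):
    """All overlapping n-grams of a sequence."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


class NGramNaiveBayes:
    """Multinomial naive Bayes over sequence n-grams (illustrative only)."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n
        self.alpha = alpha  # Laplace smoothing constant
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.class_docs = Counter()  # label -> number of training sequences
        self.vocab = set()

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[label].update(grams)
            self.class_docs[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, seq):
        """Log prior plus summed log likelihood of each n-gram, per class."""
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, gram_counts in self.counts.items():
            total = sum(gram_counts.values())
            score = log(self.class_docs[label] / total_docs)
            for gram in ngrams(seq, self.n):
                score += log((gram_counts[gram] + self.alpha)
                             / (total + self.alpha * vocab_size))
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Hypothetical toy training set (invented sequences and labels).
train_seqs = ["MKKLLLAAAVLLL", "MKRLLLAAGVLLL", "PKKKRKVEDPKKKRKV", "APKKKRKVGGSKRKV"]
train_labels = ["secreted", "secreted", "nuclear", "nuclear"]

model = NGramNaiveBayes(n=3).fit(train_seqs, train_labels)
print(model.predict("GSPKKKRKVAA"))
print(model.predict("MKKLLLAAGVLL"))
```

This sketch returns a single label per sequence. Reporting more than one location for a sequence, as discussed in the abstract, would require working with the per-class scores from `log_scores` rather than only the top class.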
Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function first harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weight to the n-grams that appear in fewer classes and vice versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. 
Our method fared well compared to an existing motif-finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is generic and can therefore be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion: The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets, have a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes with a potential application to detecting proteome-specific motifs of different organisms.
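The steps named in this abstract (harvest n-grams, pool similar n-grams by positive substitution scores, dampen by the number of classes an n-gram appears in, then threshold) can be sketched as follows. The abstract does not give the actual formulas, so everything specific here is an assumption: the dampening factor is taken to be an IDF-style log(number of classes / number of classes containing the n-gram), similarity is checked position by position, and `POSITIVE_PAIRS` is a deliberately small subset of residue pairs that score positively in BLOSUM62 rather than the full matrix. The sequences and the threshold are invented.

```python
from collections import Counter
from math import log

# A small, incomplete subset of residue pairs with positive off-diagonal
# BLOSUM62 scores; a real run would load the full matrix instead.
POSITIVE_PAIRS = {frozenset(p) for p in
                  ("KR", "DE", "IL", "IV", "LM", "FY", "ST", "QE", "WY")}


def similar(a, b):
    """True if every aligned residue pair is identical or a positive pair."""
    return len(a) == len(b) and all(
        x == y or frozenset((x, y)) in POSITIVE_PAIRS for x, y in zip(a, b))


def harvest(seqs, n):
    """Count all overlapping n-grams in a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - n + 1):
            counts[s[i:i + n]] += 1
    return counts


def support(gram, counts):
    """Occurrences of gram or of any n-gram similar to it."""
    return sum(c for other, c in counts.items() if similar(gram, other))


def discriminative_ngrams(classes, n=4, threshold=0.1):
    """Return, per class, the n-grams whose dampened frequency >= threshold."""
    raw = {label: harvest(seqs, n) for label, seqs in classes.items()}
    n_classes = len(raw)
    selected = {}
    for label, counts in raw.items():
        total = sum(counts.values())
        scored = {}
        for gram in counts:
            hits = {other: support(gram, raw[other]) for other in raw}
            classes_with_gram = sum(1 for v in hits.values() if v > 0)
            # Assumed dampening factor: zero when present in every class.
            damp = log(n_classes / classes_with_gram)
            score = hits[label] / total * damp
            if score >= threshold:
                scored[gram] = score
        selected[label] = scored
    return selected


# Hypothetical toy dataset (invented sequences).
toy_classes = {
    "nuclear": ["GKKRKVG", "AKRRKVA", "GGGGG"],
    "other": ["GLLSAVG", "ALISAVA", "GGGGG"],
}

result = discriminative_ngrams(toy_classes, n=4, threshold=0.1)
for label, grams in result.items():
    print(label, sorted(grams))
```

In the toy data, `KKRK` and `KRRK` support each other through the K/R substitution and so pass the threshold together, while `GGGG` occurs in both classes and is dampened to zero. Mapping the selected n-grams back onto the sequences to form contiguous motifs, as the abstract describes, is not covered by this sketch.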
Background: Altered DNA methylation patterns play important roles in cancer development and progression. We examined whether expression levels of genes directly or indirectly involved in DNA methylation and demethylation may be associated with response of cancer cell lines to chemotherapy treatment with a variety of antitumor agents. Results: We analyzed 72 genes encoding epigenetic factors directly or indirectly involved in DNA methylation and demethylation processes. We examined association of their pretreatment expression levels with methylation beta-values of individual DNA methylation probes, DNA methylation averaged within gene regions, and average epigenome-wide methylation levels. We analyzed data from 645 cancer cell lines and 23 cancer types from the Cancer Cell Line Encyclopedia and Genomics of Drug Sensitivity in Cancer datasets. We observed numerous correlations between expression of genes encoding epigenetic factors and response to chemotherapeutic agents. Expression of genes encoding a variety of epigenetic factors, including KDM2B, DNMT1, EHMT2, SETDB1, EZH2, APOBEC3G, and other genes, was correlated with response to multiple agents. DNA methylation of numerous target probes and gene regions was associated with expression of multiple genes encoding epigenetic factors, underscoring complex regulation of epigenome methylation by multiple intersecting molecular pathways. The genes whose expression was associated with methylation of multiple epigenome targets encode DNA methyltransferases, TET DNA methylcytosine dioxygenases, the methylated DNA-binding protein ZBTB38, KDM2B, SETDB1, and other molecular factors which are involved in diverse epigenetic processes affecting DNA methylation. 
While baseline DNA methylation of numerous epigenome targets was correlated with cell line response to antitumor agents, the complex relationships between the overlapping effects of each epigenetic factor on methylation of specific targets and the importance of such influences in tumor response to individual agents require further investigation. Conclusions: Expression of multiple genes encoding epigenetic factors is associated with drug response and with DNA methylation of numerous epigenome targets that may affect response to therapeutic agents. Our findings suggest complex and interconnected pathways regulating DNA methylation in the epigenome, which may both directly and indirectly affect response to chemotherapy.