Background: Metastasis is a major cancer-related cause of death. Recent studies have described metastasis pathways. However, the exact contribution of each pathway remains unclear. Another key feature of a tumor is the presence of hypoxic areas caused by a lack of oxygen at the center of the tumor. Hypoxia leads to the expression of pro-metastatic genes as well as the repression of anti-metastatic genes. As many Affymetrix datasets about metastasis and hypoxia are publicly available and not fully exploited, this study proposes to re-analyze these datasets to extract new information about the metastatic phenotype induced by hypoxia in different cancer cell lines.
Methods: Affymetrix datasets about metastasis and/or hypoxia were downloaded from GEO and ArrayExpress. AffyProbeMiner and GCRMA packages were used for pre-processing and the Window Welch t test was used for processing. Three approaches of meta-analysis were eventually used for the selection of genes of interest.
Results: Three complementary approaches were used, that eventually selected 183 genes of interest. Out of these 183 genes, 99, among which the well known JUNB, FOS and TP63, have already been described in the literature to be involved in cancer. Moreover, 39 genes of those, such as SERPINE1 and MMP7, are known to regulate metastasis. Twenty-one genes including VEGFA and ID2 have also been described to be involved in the response to hypoxia. Lastly, DAVID classified those 183 genes in 24 different pathways, among which 8 are directly related to cancer while 5 others are related to proliferation and cell motility. A negative control composed of 183 random genes failed to provide such results. Interestingly, 6 pathways retrieved by DAVID with the 183 genes of interest concern pathogen recognition and phagocytosis.
Conclusion: The proposed methodology was able to find genes actually known to be involved in cancer, metastasis and hypoxia and, thus, we propose that the other genes selected based on the same methodology are of prime interest in the metastatic phenotype induced by hypoxia.
This work focuses on differential expression analysis of microarray datasets. One way to improve such statistical analyses is to integrate biological information into their design. In this paper, we use the relationship between the level of gene expression and its variability. Using this biological information, we propose to integrate the information from multiple genes to obtain a better estimate of individual gene variance when only a small number of replicates is available, and thereby to increase the power of the statistical analysis. We describe a strategy named the "Window t test" that uses multiple genes sharing a similar expression level to compute the variance, which is then incorporated into a classic t test. The performance of this new method is evaluated by comparison with classic and widely used methods for differential expression analysis (the classic Student t test, the Regularized t test (reg t test), SAM, Limma, LPE and Shrinkage t). In each case tested, the results obtained were at least equivalent to those of the best performing method and, in most cases, better. Moreover, the Window t test relies on a very simple procedure requiring little computing power compared with other methods designed for microarray differential expression analysis.
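The idea of borrowing variance from genes with a similar expression level can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `window_t_test`, the `window` parameter (number of neighbours taken on each side after ranking genes by mean expression), and the choice of averaging pooled within-group variances across the window are all hypothetical, since the abstract does not give these details.

```python
import math
from statistics import mean, variance


def window_t_test(group_a, group_b, window=5):
    """Sketch of a window-style t statistic (assumed design, see lead-in).

    group_a, group_b: lists of per-gene replicate lists, in the same gene
    order; every gene needs at least two replicates per group.
    Returns one t-like statistic per gene.
    """
    n_genes = len(group_a)
    levels, pooled = [], []
    for a, b in zip(group_a, group_b):
        # Overall expression level of the gene across both conditions.
        levels.append(mean(a + b))
        # Classic pooled within-group variance for this gene alone.
        dof = len(a) + len(b) - 2
        pooled.append(((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / dof)

    # Rank genes by expression level so that neighbours in this order
    # are genes with a similar level.
    order = sorted(range(n_genes), key=lambda i: levels[i])

    t_values = [0.0] * n_genes
    for rank, i in enumerate(order):
        lo = max(0, rank - window)
        hi = min(n_genes, rank + window + 1)
        # Assumption: the window variance is the plain average of the
        # pooled variances of the neighbouring genes (the gene included).
        win_var = mean(pooled[j] for j in order[lo:hi])
        a, b = group_a[i], group_b[i]
        diff = mean(a) - mean(b)
        se = math.sqrt(win_var * (1 / len(a) + 1 / len(b)))
        if se == 0:
            # Degenerate window with no variability at all.
            t_values[i] = 0.0 if diff == 0 else math.copysign(math.inf, diff)
        else:
            t_values[i] = diff / se
    return t_values
```

With only two replicates per group, a per-gene variance rests on very few degrees of freedom; averaging over a window trades a little bias for a much more stable denominator, which is the motivation the abstract gives.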
Background: Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically-relevant data to evaluate the performance of statistical methods.
Results: Our novel method ranks the probesets from a dataset composed of publicly-available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality.
Conclusions: Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better.
Availability: The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.
Sex steroids play a key role in triggering sex differentiation in fish, and exogenous hormone treatment can lead to partial or complete sex reversal. This phenomenon has attracted attention since the discovery that even low environmental doses of exogenous steroids can adversely affect gonad morphology (ovotestis development) and induce reproductive failure. Modern genomic-based technologies have enhanced opportunities to uncover mechanisms of action (MOA) and to identify biomarkers related to the toxic action of a compound. However, high throughput data interpretation relies on statistical analysis, species genomic resources, and bioinformatics tools. The goals of this study are to improve the knowledge of feminisation in fish by analysing molecular responses in the gonads of rainbow trout fry after chronic exposure to several doses (0.01, 0.1, 1 and 10 μg/L) of ethynylestradiol (EE2), and to propose target genes as potential biomarkers of ovotestis development. We successfully adapted a bioinformatics microarray analysis workflow developed on human data to a toxicogenomic study using rainbow trout, a fish species lacking accurate functional annotation and genomic resources. The workflow yielded lists of genes expected to be enriched in true positive differentially expressed genes (DEGs), which were subjected to over-representation analysis (ORA) methods. Several pathways and ontologies, mostly related to cell division and metabolism, sexual reproduction and steroid production, were found to be significantly enriched in our analyses. Moreover, two sets of potential ovotestis biomarkers were selected using several criteria. The first group comprised specific potential biomarkers belonging to pathways/ontologies highlighted in the experiment. Among them, the early ovarian differentiation gene foxl2a was overexpressed. The second group, which was highly sensitive but not specific, included the DEGs presenting the highest fold change and lowest p-value in the statistical workflow output. The methodology can be generalized to other (non-model) species and various types of microarray platforms.
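Over-representation analysis asks whether a pathway or ontology term contains more selected genes than chance would predict. A common way to score this is a one-sided hypergeometric test; the abstract does not say which ORA statistic was used, so the test below and the name `ora_p_value` are assumptions made for illustration.

```python
from math import comb


def ora_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided hypergeometric p-value for over-representation.

    n_universe: number of genes measured on the platform
    n_in_set:   number of those genes annotated to the pathway/term
    n_selected: number of genes in the DEG list
    n_overlap:  number of DEGs that are annotated to the pathway/term
    Returns P(X >= n_overlap) when n_selected genes are drawn at random
    without replacement from the universe.
    """
    total = comb(n_universe, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

For example, `ora_p_value(4, 2, 2, 2)` is 1/6: of the six ways to pick two genes out of four, only one picks both annotated genes. In practice the p-values from many terms would still need a multiple-testing correction.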
Background: Microarray data is frequently used to characterize the expression profile of a whole genome and to compare the characteristics of that genome under several conditions. Geneset analysis methods have been described previously to analyze the expression values of several genes related by known biological criteria (metabolic pathway, pathology signature, co-regulation by a common factor, etc.) at the same time and the cost of these methods allows for the use of more values to help discover the underlying biological mechanisms.
Results: As several methods assume different null hypotheses, we propose to reformulate the main question that biologists seek to answer. To determine which genesets are associated with expression values that differ between two experiments, we focused on three ad hoc criteria: expression levels, the direction of individual gene expression changes (up or down regulation), and correlations between genes. We introduce the FAERI methodology, tailored from a two-way ANOVA to examine these criteria. The significance of the results was evaluated according to the self-contained null hypothesis, using label sampling or by inferring the null distribution from normally distributed random data. Evaluations performed on simulated data revealed that FAERI outperforms currently available methods for each type of set tested. We then applied the FAERI method to analyze three real-world datasets on hypoxia response. FAERI was able to detect more genesets than other methodologies, and the genesets selected were coherent with current knowledge of cellular response to hypoxia. Moreover, the genesets selected by FAERI were confirmed when the analysis was repeated on two additional related datasets.
Conclusions: The expression values of genesets are associated with several biological effects. The underlying mathematical structure of the genesets allows for analysis of data from several genes at the same time. Focusing on expression levels, the direction of the expression changes, and correlations, we showed that two-step data reduction allowed us to significantly improve the performance of geneset analysis using a modified two-way ANOVA procedure, and to detect genesets that current methods fail to detect.
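Testing a geneset against the self-contained null hypothesis by label sampling can be sketched as below. The sketch deliberately swaps in a much simpler geneset statistic (the mean absolute difference between group means) in place of FAERI's modified two-way ANOVA, whose details are not given in the abstract; `geneset_score`, `permutation_p_value`, `n_perm` and `seed` are hypothetical names chosen for the example.

```python
import random
from statistics import mean


def geneset_score(expr, labels):
    """Mean absolute difference in group means over the genes of one set.

    expr:   list of genes, each a list of expression values per sample
    labels: list of 0/1 condition labels, one per sample
    (Stand-in statistic, not the FAERI ANOVA statistic.)
    """
    score = 0.0
    for gene in expr:
        g0 = [v for v, lab in zip(gene, labels) if lab == 0]
        g1 = [v for v, lab in zip(gene, labels) if lab == 1]
        score += abs(mean(g1) - mean(g0))
    return score / len(expr)


def permutation_p_value(expr, labels, n_perm=1000, seed=0):
    """Estimate a p-value by shuffling sample labels.

    Shuffling labels keeps each sample's genes together, so correlations
    between genes are preserved while any link to the condition is broken.
    """
    rng = random.Random(seed)
    observed = geneset_score(expr, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if geneset_score(expr, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

With few samples the number of distinct labelings is small, which bounds how low such a p-value can go; that limitation is one reason a null distribution inferred from random data, as mentioned in the abstract, can be a useful alternative.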