Statistical methods for enrichment analysis are important tools to extract biological information from omics experiments. Although these methods have been widely used for the analysis of gene and protein lists, the development of high-throughput technologies for regulatory elements demands dedicated statistical and bioinformatics tools. Here, we present a set of enrichment analysis methods for regulatory elements, including CpG sites, miRNAs, and transcription factors. Statistical significance is determined via a power weighting function for target genes and tested by the Wallenius noncentral hypergeometric distribution model to avoid selection bias. These new methodologies have been applied to the analysis of a set of miRNAs associated with arrhythmia, showing the potential of this tool to extract biological information from a list of regulatory elements. These new methods are available in GeneCodis 4, a web tool able to perform singular and modular enrichment analysis that allows the integration of heterogeneous information.
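As a rough illustration of the test named above, the following sketch computes an upper-tail enrichment p-value under the Wallenius noncentral hypergeometric distribution, using only the Python standard library. It is not the GeneCodis 4 implementation. The function names and the single `odds` parameter are hypothetical stand-ins for the bias weight, since the abstract does not say how the power weighting function produces it. The sketch assumes `odds >= 1` and that fewer items are drawn than exist in total.

```python
from math import comb


def _simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def wallenius_pmf(x, m1, m2, n, odds):
    """P(X = x) when n items are drawn without replacement from m1 items
    of weight `odds` and m2 items of weight 1, and x come from the m1 group."""
    if odds < 1 or not 0 <= n < m1 + m2:
        raise ValueError("sketch assumes odds >= 1 and 0 <= n < m1 + m2")
    if x < max(0, n - m2) or x > min(n, m1):
        return 0.0  # outside the support
    # Total weight of the items that remain undrawn.
    d = odds * (m1 - x) + (m2 - (n - x))

    # Wallenius integral after substituting t = u**d, which keeps the
    # integrand smooth on [0, 1] for numerical integration.
    def integrand(u):
        return (1 - u**odds) ** x * (1 - u) ** (n - x) * d * u ** (d - 1)

    return comb(m1, x) * comb(m2, n - x) * _simpson(integrand, 0.0, 1.0)


def enrichment_pvalue(k, m1, m2, n, odds):
    """Upper-tail probability P(X >= k) of seeing k or more annotated items."""
    return sum(wallenius_pmf(x, m1, m2, n, odds) for x in range(k, min(n, m1) + 1))
```

With `odds = 1` the result reduces to the ordinary hypergeometric test. Raising `odds` makes annotated items more likely to be drawn by chance, so the same overlap yields a larger, more conservative p-value, which is the sense in which the weighting counteracts selection bias.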
The increasing use of high-throughput gene expression quantification technologies over the last two decades, together with the fact that most published studies are stored in public databases, has triggered an explosion of studies available through public repositories. All this information offers an invaluable resource for reuse to generate new knowledge and scientific findings. In this context, great interest has been focused on meta-analysis methods to integrate and jointly analyze different gene expression datasets. In this work, we describe the main steps in gene expression meta-analysis, from data preparation to state-of-the-art statistical methods. We also analyze the main types of applications and problems that can be approached in gene expression meta-analysis studies and provide a comparative overview of the available software and bioinformatics tools. Moreover, a practical guide for choosing the most appropriate method in each case is also provided.
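The abstract above does not name specific statistical methods. As one hedged example of the effect-size family commonly used in meta-analysis, the sketch below pools per-study effect sizes for a single gene with a DerSimonian-Laird random-effects model. The function name and the input layout (one effect size and one sampling variance per study) are assumptions made for illustration, not something taken from the paper.

```python
from math import sqrt


def random_effects_pool(effects, variances):
    """Pool per-study effect sizes with the DerSimonian-Laird estimator.

    Returns (pooled_effect, standard_error, tau2), where tau2 is the
    estimated between-study variance.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists with at least two studies")
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity across studies.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    # Method-of-moments estimate, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    return pooled, sqrt(1.0 / sum_w_star), tau2
```

When the studies agree, `tau2` is zero and the estimate coincides with the fixed-effect inverse-variance mean. When they disagree, the larger `tau2` evens out the weights and widens the standard error.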
The coronavirus disease 2019 (COVID-19) pandemic has caused an unprecedented global health crisis, with several countries imposing lockdowns to control the coronavirus spread. Important research efforts are focused on evaluating the association of environmental factors with the survival and spread of the virus, and different works have been published, with contradictory results in some cases. Data with spatial and temporal information is a key factor to get reliable results and, although there are some data repositories for monitoring the disease both globally and locally, an application that integrates and aggregates data from meteorological and air quality variables with COVID-19 information has not been described so far to the best of our knowledge. Here, we present DatAC (Data Against COVID-19), a data fusion project with an interactive web frontend that integrates COVID-19 and environmental data in Spain. DatAC is provided with powerful data analysis and statistical capabilities that allow users to explore and analyze individual trends and associations among the provided data. Using the application, we have evaluated the impact of the Spanish lockdown on the air quality, observing that NO2, CO, PM2.5, PM10 and SO2 levels decreased drastically in the entire territory, while O3 levels increased. We observed similar trends in urban and rural areas, although the impact has been more important in the former. Moreover, the application allowed us to analyze correlations among climate factors, such as ambient temperature, and the incidence of COVID-19 in Spain. Our results indicate that temperature is not the driving factor and that, without effective control actions, outbreaks will appear and warm weather will not substantially limit the growth of the pandemic. DatAC is available at https://covid19.genyo.es.
Systemic lupus erythematosus (SLE) is a heterogeneous disease with unpredictable patterns of activity. Patients with similar activity levels may have different prognoses and molecular abnormalities. In this study, we aimed to measure the main differences in drug-induced gene expression signatures across SLE patients and to evaluate the potential for clinical data to build a machine learning classifier able to predict the SLE subset for individual patients. SLE transcriptomic data from two cohorts were compared with drug-induced gene signatures from the CLUE database to compute a connectivity score that reflects the capability of a drug to revert the patient signatures. Patient stratification based on drug connectivity scores revealed robust clusters of SLE patients identical to the clusters previously obtained through longitudinal gene expression data, implying that differential treatment depends on the cluster to which patients belong. The best drug candidates found, mTOR inhibitors or those reducing oxidative stress, showed stronger cluster specificity. We report that drug patterns for reverting disease gene expression follow the cell-specificity of the disease clusters. We used 2 cohorts to train and test a logistic regression model that we employed to classify patients from 3 independent cohorts into the SLE subsets and provide a clinically useful model to predict subset assignment and drug efficacy.