One major task in molecular biology is to understand the dependency among genes in order to model gene regulatory networks. Pearson's correlation is the most common method used to measure dependence between gene expression signals, but it works well only when data are linearly associated. For other types of association, such as non-linear or non-functional relationships, methods based on rank correlation and information theory are more adequate than Pearson's correlation, but they are less used in applications, most probably because of a lack of clear guidelines for their use. This work seeks to summarize the main methods (Pearson's, Spearman's and Kendall's correlations; distance correlation; Hoeffding's D measure; Heller-Heller-Gorfine measure; mutual information and maximal information coefficient) used to identify dependency between random variables, especially in gene expression data, and also to evaluate the strengths and limitations of each method. We present systematic Monte Carlo simulations that vary sample size, local dependence, and the type of relationship (linear, non-linear, and non-functional). Moreover, comparisons on actual gene expression data are carried out. Finally, we provide a suggested list of methods that can be used for each type of data set.
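The contrast between linear and rank-based measures described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function names (`pearson`, `ranks`, `spearman`) and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import math

def pearson(x, y):
    """Pearson's correlation: measures linear association."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Toy data (hypothetical): x is symmetric around zero.
x = [i / 2 for i in range(-10, 11)]
y_mono = [math.exp(v) for v in x]   # monotone but non-linear
y_quad = [v * v for v in x]         # functional but non-monotone

print("exp(x): pearson=%.3f spearman=%.3f" % (pearson(x, y_mono), spearman(x, y_mono)))
print("x^2:    pearson=%.3f spearman=%.3f" % (pearson(x, y_quad), spearman(x, y_quad)))
```

On the monotone exponential relationship, the rank-based measure reaches 1 while Pearson's correlation stays clearly below it. On the quadratic relationship both are close to zero, which is the kind of case where measures such as distance correlation or mutual information become relevant.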
The study of interactions among biological components can be carried out by using methods grounded in network theory. Most of these methods focus on the comparison of two biological networks (e.g., control vs. disease). However, biological systems often present more than two biological states (e.g., tumor grades). To compare two or more networks simultaneously, we developed a package with a user-friendly graphical interface. The package compares correlation networks based on the probability distribution of a feature of the graph (e.g., centrality measures). The analysis of structural alterations in the network reveals significant modifications in the system. For example, the analysis of centrality measures provides information about how the relevance of the nodes changes among the biological states. We evaluated the performance of the package on both toy models and two case studies, the latter related to gene expression of tumor cells and to plant metabolism. Results based on simulated scenarios suggest that the statistical power of the package is less sensitive to an increase in the number of networks than that of Gene Set Coexpression Analysis. Also, besides being able to identify nodes with modified centralities, the package identified altered networks associated with signaling pathways that were not identified by other methods.
Recent studies have suggested abnormal brain network organization in subjects with Autism Spectrum Disorders (ASD). Here we applied a spectral clustering algorithm, several centrality measures (betweenness (BC), clustering (CC), eigenvector (EC), and degree (DC)), and the network entropy (NE) to identify brain sub-systems associated with ASD. We found that BC increases in ASD in the somatomotor, default-mode, cerebellar, and fronto-parietal clusters. On the other hand, CC, EC, and DC decrease in the somatomotor, default-mode, and cerebellar clusters. Additionally, NE decreases in ASD in the cerebellar cluster. These findings reinforce the hypothesis of under-connectivity in ASD and suggest that the difference in network organization is most prominent in the cerebellar system. The cerebellar cluster presents reduced NE in ASD, which relates to a more regular organization of the networks. These results might be important for improving the current understanding of the etiological processes and for developing potential tools to support diagnosis and therapeutic interventions.
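Two of the node-level measures named above, degree centrality (DC) and the clustering coefficient (CC), can be sketched on a small undirected graph. This is only an illustration under stated assumptions: the four-node graph and the function names are hypothetical, the normalizations are the common textbook ones, and nothing here reproduces the study's pipeline or data.

```python
from itertools import combinations

def degree_centrality(adj):
    """Fraction of the other nodes that each node is connected to."""
    n = len(adj)
    return {v: len(nb) / (n - 1) for v, nb in adj.items()}

def clustering_coefficient(adj):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    cc = {}
    for v, nb in adj.items():
        k = len(nb)
        if k < 2:
            cc[v] = 0.0  # convention: undefined cases are set to zero
            continue
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        cc[v] = 2 * links / (k * (k - 1))
    return cc

# Hypothetical undirected graph: a triangle A-B-C plus a pendant node D.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

dc = degree_centrality(adj)
cc = clustering_coefficient(adj)
for node in sorted(adj):
    print(node, round(dc[node], 3), round(cc[node], 3))
```

In this toy graph, node A has the highest degree centrality but a low clustering coefficient, because its neighbor D is not linked to B or C. Comparing such per-node values between groups is the general idea behind the group differences reported above.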
Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, by incorporating biological knowledge about the gene sets and enhancing statistical power over gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of Information Theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy to measure the randomness of a graph structure and the Jensen-Shannon divergence to discriminate classes of graphs. The package also includes common measures to compare gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA also includes graphical interfaces for visual inspection of the networks, ranking of genes according to their “importance” in the network, and the standard differential expression analysis. We show by both simulation experiments and analyses of real data that the statistical tests performed by CoGA indeed control the rate of false positives and that CoGA is able to identify differentially co-expressed genes that other methods failed to detect.
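The two information-theoretic quantities mentioned above can be sketched for discrete distributions. The example below is a minimal illustration in Python, not CoGA's R implementation: it assumes the spectral distribution of each graph has already been summarized as a normalized histogram over shared bins, and the histograms `p` and `q` are made-up values.

```python
import math

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical binned spectral densities of two co-expression graphs.
p = [0.10, 0.40, 0.40, 0.10]
q = [0.25, 0.25, 0.25, 0.25]

print("entropy(p) =", round(entropy(p), 4))
print("entropy(q) =", round(entropy(q), 4))
print("JS(p, q)   =", round(jensen_shannon(p, q), 4))
```

A flatter distribution has higher entropy, and the divergence is zero only when the two distributions coincide. A test of differential co-expression would additionally need a null distribution for the divergence, for example from permuting phenotype labels, which is not shown here.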
Highlights
• A matrix decomposition model for repurposing broad-spectrum antivirals
• A graph kernel approach to model perturbations induced by drugs on the interactome
• Graph kernels can integrate transcriptomics data to improve drug repurposing
• CoREx: a free online tool to formulate hypotheses for drug repurposing for COVID-19