Gene co-expression networks can be used to associate genes of unknown function with biological processes, to prioritize candidate disease genes or to discern transcriptional regulatory programmes. With recent advances in transcriptomics and next-generation sequencing, co-expression networks constructed from RNA sequencing data also enable the inference of functions and disease associations for non-coding genes and splice variants. Although gene co-expression networks typically do not provide information about causality, emerging methods for differential co-expression analysis are enabling the identification of regulatory genes underlying various phenotypes. Here, we introduce and guide researchers through a (differential) co-expression analysis. We provide an overview of methods and tools used to create and analyse co-expression networks constructed from gene expression data, and we explain how these can be used to identify genes with a regulatory role in disease. Furthermore, we discuss the integration of other data types with co-expression networks and offer future perspectives of co-expression analysis.
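As a rough illustration of the kind of analysis this overview describes, the sketch below builds a toy co-expression network by thresholding pairwise Pearson correlations, then lists the edges that are present in only one of two conditions (a very simple form of differential co-expression). The gene names, expression values, the 0.8 cutoff, and the function names are all hypothetical and chosen for illustration. Real analyses typically use dedicated tools and more careful statistics than a hard correlation cutoff.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(expr, threshold=0.8):
    """Map each gene pair with |r| >= threshold to its correlation.

    `expr` maps gene name -> list of expression values across samples.
    The threshold is an illustrative assumption, not a recommended value.
    """
    edges = {}
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


def differential_edges(expr_a, expr_b, threshold=0.8):
    """Gene pairs that are co-expressed in exactly one of two conditions."""
    edges_a = coexpression_edges(expr_a, threshold)
    edges_b = coexpression_edges(expr_b, threshold)
    return set(edges_a) ^ set(edges_b)


# Hypothetical toy data: three genes measured in five samples per condition.
healthy = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [2, 5, 1, 4, 3],
}
disease = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [3, 1, 4, 1, 5],
    "geneC": [2, 5, 1, 4, 3],
}

rewired = differential_edges(healthy, disease)
```

In this toy data, geneA and geneB are co-expressed in the first condition but not in the second, so that pair is the only edge flagged as changed.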
Stochastic changes in cytosine methylation are a source of heritable epigenetic and phenotypic diversity in plants. Using the model plant Arabidopsis thaliana, we derive robust estimates of the rate at which methylation is spontaneously gained (forward epimutation) or lost (backward epimutation) at individual cytosines and construct a comprehensive picture of the epimutation landscape in this species. We demonstrate that the dynamic interplay between forward and backward epimutations is modulated by genomic context and show that subtle contextual differences have profoundly shaped patterns of methylation diversity in A. thaliana natural populations over evolutionary timescales. Theoretical arguments indicate that the epimutation rates reported here are high enough to rapidly uncouple genetic from epigenetic variation, but low enough for new epialleles to sustain long-term selection responses. Our results provide new insights into methylome evolution and its population-level consequences.
epigenetics | epimutation | DNA methylation | evolution | Arabidopsis
Plant genomes make extensive use of cytosine methylation to control the expression of transposable elements (TEs) and genes (1). Despite its tight regulation, methylation losses or gains at individual cytosines or clusters of cytosines can emerge spontaneously, in an event termed "epimutation" (2, 3). Many examples of segregating epimutations have been documented in experimental and wild populations of plants and in some cases contribute to heritable variation in phenotypes independently of DNA sequence variation (4, 5). These observations have led to much speculation about the role of DNA methylation in plant evolution (6-8), and its potential in breeding programs (9). In the model plant Arabidopsis thaliana, spontaneous methylation changes at CG dinucleotides accumulate in a rapid but nonlinear fashion over generations (2, 3, 10), thus pointing to high forward-backward epimutation rates (11).
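One way to see why forward and backward epimutations together produce nonlinear accumulation is a two-state Markov chain per cytosine. The sketch below assumes a site gains methylation with probability `alpha` and loses it with probability `beta` each generation. The rate values used are hypothetical placeholders, not the estimates reported in the study, and this simple model is an illustration rather than the authors' own theoretical framework.

```python
def methylated_probability(t, alpha, beta, start_methylated=False):
    """Closed-form P(site is methylated after t generations).

    Two-state Markov chain: unmethylated -> methylated with probability
    alpha (forward epimutation), methylated -> unmethylated with
    probability beta (backward epimutation), per generation.
    """
    equilibrium = alpha / (alpha + beta)
    p0 = 1.0 if start_methylated else 0.0
    return equilibrium + (p0 - equilibrium) * (1.0 - alpha - beta) ** t


def iterate_generations(t, alpha, beta, start_methylated=False):
    """Same quantity computed by stepping the chain one generation at a time."""
    p = 1.0 if start_methylated else 0.0
    for _ in range(t):
        p = p * (1.0 - beta) + (1.0 - p) * alpha
    return p


# Hypothetical per-site, per-generation rates, chosen only for illustration.
ALPHA = 1e-3
BETA = 2e-3

early_gain = methylated_probability(10, ALPHA, BETA)
late_gain = (methylated_probability(1010, ALPHA, BETA)
             - methylated_probability(1000, ALPHA, BETA))
```

Because backward epimutations erase some of the gains, the change over ten generations is larger early on (`early_gain`) than later (`late_gain`), and the methylated fraction levels off at `alpha / (alpha + beta)` instead of growing linearly.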
Precise estimates of these rates are necessary to be able to quantify the long-term dynamics of epigenetic variation under laboratory or natural conditions, and to understand the molecular mechanisms that drive methylome evolution (12-14). Here we combine theoretical modeling with high-resolution methylome analysis of multiple independent A. thaliana mutation accumulation (MA) lines (15), including measurements of methylation changes in continuous generations, to obtain robust estimates of forward and backward epimutation rates.
Results
We joined whole-genome MethylC-seq (16) data from two earlier MA studies (2, 3) with extensive multigenerational MethylC-seq measurements from three additional MA lines (Fig. 1A and SI Appendix, Tables S1-S6). The first of these new MA lines (MA1 3) was propagated for 30 generations and includes measurements for 13 (nearly) consecutive generations (Fig. 1A). The other two MA lines (MA2 3) were propagated for 17 generations and were measured every four generations on average (Fig. 1A). These new data therefore allowed us to track epimutation...
statistics from other independent studies [13,17,18], we identify novel host-microbiota interactions. Furthermore, we explore the impact of potential confounding factors in modulating these genetic effects and identify potential diet-dependent host-microbiota interactions. We further assess the potential causal relationships between the gut microbiome and dietary habits, biomarkers and disease using Mendelian randomization (MR). Finally, we carry out a power analysis showing how microbiome studies, even at the current sample size, are underpowered to reveal the complex genetic architecture by which host genetics regulates the gut microbiome.
Results
Genome-wide associations with bacterial taxa and pathways. We investigated 5.5 million common (minor allele frequency (MAF) > 0.05) genetic variants on all autosomes and the X chromosome using linear mixed models [19] to test their association with 207 taxa and 205 bacterial pathways in 7,738 individuals from the DMP cohort (Methods and Supplementary Table 1) [19]. There was no evidence for test statistic inflation (median genomic lambda 1.002 (range, 0.75-1.03) for taxa and 1.004 (range, 0.87-1.04) for pathways). We identified 37 single nucleotide polymorphism (SNP)-trait associations at 24 independent loci at a genome-wide P value threshold of 5 × 10⁻⁸ (Fig. 1 and Supplementary Table 2). Genetic variants at two loci passed the more stringent study-wide threshold of 1.89 × 10⁻¹⁰ that accounts for the number of independent tests performed (Methods).
The strongest signal was seen for rs182549 located in an intron of MCM6, a perfect proxy of rs4988235 (r² = 1, 1000 Genomes Project European populations), one of the variants known to regulate the LCT gene and responsible for lactase persistence in adults (ClinVar accession RCV000008124).
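The two quality-control quantities mentioned above can be sketched in a few lines. The genomic inflation factor (lambda) is conventionally the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null, and a study-wide threshold is a Bonferroni-style division of the genome-wide threshold by the number of independent tests. The function names are hypothetical, and the test count of 264 is an assumption picked only because 5 × 10⁻⁸ / 264 is about 1.89 × 10⁻¹⁰. The excerpt itself does not state how many independent tests were used.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.455),
# obtained as the squared 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_lambda(p_values):
    """Genomic inflation factor from two-sided association P values.

    Each P value (0 < p <= 1) is converted back to a 1-df chi-square
    statistic; values near 1.0 suggest no test statistic inflation.
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


def study_wide_threshold(genome_wide_alpha, n_independent_tests):
    """Bonferroni-style correction of a genome-wide threshold."""
    return genome_wide_alpha / n_independent_tests


# Evenly spaced P values mimic a well-calibrated null distribution.
null_p_values = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_lambda(null_p_values)

# 264 is an illustrative assumption, not a number taken from the text.
threshold = study_wide_threshold(5e-8, 264)
```

With uniformly distributed P values, `lam` comes out very close to 1, which is the pattern the reported median lambdas of 1.002 and 1.004 are describing.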
The T allele of rs182549, which confers lactase persistence through a dominant model of inheritance, was found to be associated with decreased abundances of the species Bifidobacterium adolescentis (P = 7.6 × 10⁻¹⁴) and Bifidobacterium longum (P = 3.2 × 10⁻⁸), as well as decreased abundances of higher-level taxa (Supplementary Table 2 (ref. 5)). Associations at this locus were also seen for other taxa of the same genus but at lower levels of significance (Bifidobacterium catenulatum, P = 3.9 × 10⁻⁵) and for species of the Collinsella genus (Extended Data Fig. 1). The genetic association at the LCT locus has been previously described, albeit only at the genus level, in Dutch, UK and US cohorts [6,8,14], as well as in a recent large-scale meta-analysis [13].
The second locus that passed study-wide significance consisted of genetic variants near the ABO gene. ABO encodes the BGAT protein, a histo-blood group ABO system transferase. Associations found at this locus include species Bifidobacterium bifidum (rs8176645, P = 5.5 × 10⁻¹⁵) and Collinsella aerofaciens (rs550057, P = 2.0 × 10⁻⁸, r² = 0.59 with rs8176645 in 1000 Genomes Project Europeans) and higher-order taxa (rs550057, genus Collinsella, P = 9.3 × 10⁻¹¹; family Coriobacteriac...