Summary: Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.
Defining the contributions and interactions of paternal and maternal genomes during embryo development is critical to understand the fundamental processes involved in hybrid vigor, hybrid sterility, and reproductive isolation. To determine the parental contributions and their regulation during Arabidopsis embryogenesis, we combined deep-sequencing-based RNA profiling and genetic analyses. At the 2-4 cell stage there is a strong, genome-wide dominance of maternal transcripts, although transcripts are contributed by both parental genomes. At the globular stage the relative paternal contribution is higher, largely due to a gradual activation of the paternal genome. We identified two antagonistic maternal pathways that control these parental contributions. Paternal alleles are initially downregulated by the chromatin siRNA pathway, linked to DNA and histone methylation, whereas transcriptional activation requires maternal activity of the histone chaperone complex CAF1. Our results define maternal epigenetic pathways controlling the parental contributions in plant embryos, which are distinct from those regulating genomic imprinting.
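The parental contribution described in this abstract can be illustrated with a small calculation. The sketch below is not the authors' pipeline: it assumes that reads overlapping polymorphisms between the two parental accessions have already been assigned to the maternal or paternal allele, and it tests each gene against an equal 1:1 contribution with an exact binomial test. The function names and the read counts in the usage note are hypothetical.

```python
from math import comb


def maternal_fraction(maternal_reads, paternal_reads):
    """Share of allele-assigned reads that carry the maternal allele."""
    total = maternal_reads + paternal_reads
    if total == 0:
        raise ValueError("no allele-assigned reads for this gene")
    return maternal_reads / total


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k maternal reads out of n.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null hypothesis (default: equal
    contribution of both parental genomes, p = 0.5).
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties are not lost to rounding error.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, tail)
```

With hypothetical counts of 90 maternal and 10 paternal reads, `maternal_fraction(90, 10)` gives 0.9, and `binomial_two_sided_p(90, 100)` gives the probability of a skew at least this extreme if both genomes contributed equally. Comparing such fractions between the 2-4 cell and globular stages would show the shift toward a higher paternal contribution that the abstract reports.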
The acquisition of distinct cell fates is central to the development of multicellular organisms and is largely mediated by gene expression patterns specific to individual cells and tissues. A spatially and temporally resolved analysis of gene expression facilitates the elucidation of transcriptional networks linked to cellular identity and function. We present an approach that allows cell type-specific transcriptional profiling of distinct target cells, which are rare and difficult to access, with unprecedented sensitivity and resolution. We combined laser-assisted microdissection (LAM), linear amplification starting from <1 ng of total RNA, and RNA-sequencing (RNA-Seq). As a model we used the central cell of the Arabidopsis thaliana female gametophyte, one of the female gametes harbored in the reproductive organs of the flower. We estimated the number of expressed genes to be more than twice the number reported previously in a study using LAM and ATH1 microarrays, and identified several classes of genes that were systematically underrepresented in the transcriptome measured with the ATH1 microarray. Among them are many genes that are likely to be important for developmental processes and specific cellular functions. In addition, we identified several intergenic regions, which are likely to be transcribed, and describe a considerable fraction of reads mapping to introns and regions flanking annotated loci, which may represent alternative transcript isoforms. Finally, we performed a de novo assembly of the transcriptome and show that the method is suitable for studying individual cell types of organisms lacking reference sequence information, demonstrating that this approach can be applied to most eukaryotic organisms.
The binding and contribution of transcription factors (TFs) to cell-specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, histone marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find that low-affinity binding sites improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators, as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively.
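The difference between a quantitative affinity and a presence/absence call can be shown with a toy example. This is a simplified sketch and not TEPIC's implementation: the position weight matrix is made up, the affinity is approximated as the sum of exponentiated log-odds scores over all windows in a region, and the gene-based score is taken to be a plain signal-weighted sum over the regions assigned to a gene. All names and values are hypothetical.

```python
import math

# Hypothetical log-odds PWM for a 4-bp motif (consensus ACGT); one dict per position.
TOY_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
]


def window_scores(seq, pwm):
    """Log-odds score of every motif-length window in the sequence."""
    k = len(pwm)
    return [
        sum(pwm[i][seq[start + i]] for i in range(k))
        for start in range(len(seq) - k + 1)
    ]


def region_affinity(seq, pwm):
    """Quantitative affinity: every window contributes, including weak ones."""
    return sum(math.exp(score) for score in window_scores(seq, pwm))


def has_hit(seq, pwm, threshold):
    """Presence/absence call: True if any window reaches the threshold."""
    return any(score >= threshold for score in window_scores(seq, pwm))


def gene_score(regions, pwm):
    """Signal-weighted sum of affinities over a gene's open-chromatin regions.

    regions: list of (sequence, open_chromatin_signal) pairs.
    """
    return sum(signal * region_affinity(seq, pwm) for seq, signal in regions)
```

For the near-consensus sequence `"ACGA"`, `has_hit("ACGA", TOY_PWM, 3.0)` is false, so a thresholded classification discards the site, while `region_affinity("ACGA", TOY_PWM)` still returns a positive value that can enter the gene score. This is the kind of low-affinity information the abstract reports as helpful for explaining expression variability.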
Objective: An inadequate host response to the intestinal microbiota likely contributes to the manifestation and progression of human inflammatory bowel disease (IBD). However, molecular approaches to unravelling the nature of the defective crosstalk and its consequences for intestinal metabolic and immunological networks are lacking. We assessed the mucosal transcript levels, splicing architecture and mucosa-attached microbial communities of patients with IBD to obtain a comprehensive view of the underlying, hitherto poorly characterised interactions, and how these are altered in IBD. Design: Mucosal biopsies from patients with Crohn's disease or UC, disease controls and healthy individuals (n=63) were subjected to microbiome, transcriptome and splicing analysis, employing next-generation sequencing. The three data levels were integrated by different bioinformatic approaches, including systems biology-inspired network and pathway analysis. Results: Microbiota, host transcript levels and host splicing patterns were influenced most strongly by tissue differences, followed by the effect of inflammation. Both factors point towards a substantial disease-related alteration of metabolic processes. We also observed a strong enrichment of splicing events in inflamed tissues, accompanied by an alteration of the mucosa-attached bacterial taxa. Finally, we noted a striking uncoupling of the three molecular entities when moving from healthy individuals via disease controls to patients with IBD. Conclusions: Our results provide strong evidence that the interplay between microbiome and host transcriptome, which normally characterises a state of intestinal homeostasis, is drastically perturbed in Crohn's disease and UC. Consequently, integrating multiple OMICs levels appears to be a promising approach to further disentangle the complexity of IBD.