Understanding the genetic architecture of gene expression traits is key to elucidating the underlying mechanisms of complex traits. Here, for the first time, we perform a systematic survey of the heritability and the distribution of effect sizes across all representative tissues in the human body. We find that local h² can be relatively well characterized with 59% of expressed genes showing significant h² (FDR < 0.1) in the DGN whole blood cohort. However, current sample sizes (n ≤ 922) do not allow us to compute distal h². Bayesian Sparse Linear Mixed Model (BSLMM) analysis provides strong evidence that the genetic contribution to local expression traits is dominated by a handful of genetic variants rather than by the collective contribution of a large number of variants each of modest size. In other words, the local architecture of gene expression traits is sparse rather than polygenic across all 40 tissues (from DGN and GTEx) examined. This result is confirmed by the sparsity of optimal performing gene expression predictors via elastic net modeling. To further explore the tissue context specificity, we decompose the expression traits into cross-tissue and tissue-specific components using a novel Orthogonal Tissue Decomposition (OTD) approach. Through a series of simulations we show that the cross-tissue and tissue-specific components are identifiable via OTD. Heritability and sparsity estimates of these derived expression phenotypes show similar characteristics to the original traits. Consistent properties relative to prior GTEx multi-tissue analysis results suggest that these traits reflect the expected biology. Finally, we apply this knowledge to develop prediction models of gene expression traits for all tissues. The prediction models, heritability, and prediction performance R² for original and decomposed expression phenotypes are made publicly available (https://github.com/hakyimlab/PrediXcan).
Bacterial viruses (bacteriophages) play a significant role in microbial community dynamics. Within the human gastrointestinal tract, for instance, associations among bacteriophages (phages), microbiota stability, and human health have been discovered. In contrast to the gastrointestinal tract, the phages associated with the urinary microbiota are largely unknown. Preliminary metagenomic surveys of the urinary virome indicate a rich diversity of novel lytic phage sequences at an abundance far outnumbering that of eukaryotic viruses. These surveys, however, exclude the lysogenic phages residing within the bacteria of the bladder. To characterize this phage population, we examined 181 genomes representative of the phylogenetic diversity of bacterial species within the female urinary microbiota and found 457 phage sequences, 226 of which were predicted with high confidence. Phages were prevalent within the bladder bacteria: 86% of the genomes examined contained at least one phage sequence. Most of these phages are novel, exhibiting no discernible sequence homology to sequences in public data repositories. The presence of phages with substantial sequence similarity within the microbiota of different women supports the existence of a core community of phages within the bladder. Furthermore, the observed variation between the phage populations of women with and without overactive bladder symptoms suggests that phages may contribute to urinary health. To complement our bioinformatic analyses, viable phages were cultivated from the bacterial isolates for characterization; a novel coliphage was isolated, which is obligately lytic in the laboratory strain C. Sequencing of bacterial genomes facilitates a comprehensive cataloguing of the urinary virome and reveals phage-host interactions. Bacteriophages are abundant within the human body. However, while some niches have been well surveyed, the phage population within the urinary microbiome is largely unknown. 
Our study is the first survey of the lysogenic phage population within the urinary microbiota. Most notably, the abundance of prophages exceeds that of the bacteria. Furthermore, many of the prophage sequences identified exhibited no recognizable sequence homology to sequences in data repositories, which suggests a rich diversity of uncharacterized phage species in the bladder. Additionally, we observed variation in phage abundance between bacteria isolated from asymptomatic "healthy" individuals and those from individuals with urinary symptoms, suggesting that, like phages within the gut, phages within the bladder may contribute to urinary health.
Background: The persistent decrease in cost and difficulty of whole genome sequencing of microbial organisms has led to a dramatic increase in the number of species and strains characterized from a wide variety of environments. Microbial genome sequencing can now be conducted by small laboratories and as part of undergraduate curricula. While sequencing is routine in microbiology, assembly, annotation and downstream analyses still require computational resources and expertise, often necessitating familiarity with programming languages. To address this problem, we have created a light-weight, user-friendly tool for the assembly and annotation of microbial sequencing projects. Results: The Prokaryotic Assembly and Annotation Tool, Peasant, automates the processes of read quality control, genome assembly, and annotation for microbial sequencing projects. High-quality assemblies and annotations can be generated by Peasant without the need for programming expertise or high-performance computing resources. Furthermore, statistics are calculated so that users can evaluate their sequencing project. To illustrate the computational speed and accuracy of Peasant, the SRA records of 322 Illumina platform whole genome sequencing assays for Bacillus species were retrieved from NCBI, assembled and annotated on a single desktop computer. From the assemblies and annotations produced, a comprehensive analysis of the diversity of over 200 high-quality samples was conducted, looking at both the 16S rRNA phylogenetic marker and the Bacillus core genome. Conclusions: Peasant provides an intuitive solution for high-quality whole genome sequence assembly and annotation for users with limited programming experience and/or computational resources. The analysis of the Bacillus whole genome sequencing projects exemplifies the utility of this tool. Furthermore, the study conducted here provides insight into the diversity of the species, the largest such comparison conducted to date.