Gene-by-environment (GxE) interactions determine common disease risk factors and biomedically relevant complex traits. However, quantifying how the environment modulates genetic effects on human quantitative phenotypes presents unique challenges. Environmental covariates are complex and difficult to measure and control at the organismal level, as found in GWAS and epidemiological studies. An alternative approach focuses on the cellular environment using in vitro treatments as a proxy for the organismal environment. These cellular environments simplify the organism-level environmental exposures to provide a tractable influence on subcellular phenotypes, such as gene expression. Expression quantitative trait loci (eQTL) mapping studies identified GxE interactions in response to drug treatment and pathogen exposure. However, eQTL mapping approaches are infeasible for large-scale analysis of multiple cellular environments. Recently, allele-specific expression (ASE) analysis emerged as a powerful tool to identify GxE interactions in gene expression patterns by exploiting naturally occurring environmental exposures. Here we characterized genetic effects on the transcriptional response to 50 treatments in five cell types. We discovered 1455 genes with ASE (FDR < 10%) and 215 genes with GxE interactions. We demonstrated a major role for GxE interactions in complex traits. Genes with a transcriptional response to environmental perturbations showed sevenfold higher odds of being found in GWAS. Additionally, 105 genes that indicated GxE interactions (49%) were identified by GWAS as associated with complex traits. Examples include the GIPR-caffeine interaction and obesity, and the LAMP3-selenium interaction and Parkinson disease. Our results demonstrate that comprehensive catalogs of GxE interactions are indispensable to thoroughly annotate genes and bridge epidemiological and genome-wide association studies.
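As a rough illustration of the idea behind ASE-based GxE detection, the sketch below tests one heterozygous site for allelic imbalance with an exact binomial test and then measures how the allelic ratio shifts between a control and a treated sample. This is a simplified stand-in, not the statistical model used in the study; the function names, the 0.5 pseudocount, and the read counts are all hypothetical.

```python
from math import comb, log


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    Under the null hypothesis, each read comes from the reference allele
    with probability p_null (0.5 means balanced expression).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    p_value = sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))
    return min(1.0, p_value)


def allelic_log_odds_shift(ref_ctrl: int, alt_ctrl: int,
                           ref_trt: int, alt_trt: int,
                           pseudocount: float = 0.5) -> float:
    """Change in log(ref/alt) between treatment and control.

    A value far from zero suggests the treatment alters the allelic ratio,
    which is the signature of a GxE interaction. The pseudocount (an
    assumption of this sketch) avoids division by zero.
    """
    ctrl = log((ref_ctrl + pseudocount) / (alt_ctrl + pseudocount))
    trt = log((ref_trt + pseudocount) / (alt_trt + pseudocount))
    return trt - ctrl


if __name__ == "__main__":
    # Hypothetical read counts for one heterozygous SNP.
    print("control p-value:  ", binomial_two_sided_p(48, 52))
    print("treatment p-value:", binomial_two_sided_p(80, 20))
    print("log-odds shift:   ", allelic_log_odds_shift(48, 52, 80, 20))
```

A real analysis would also need to handle genotyping error, overdispersion in read counts, and multiple-testing correction (the abstract reports an FDR threshold), none of which this sketch attempts.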
Cobalamin (vitamin B12) is a complex metabolite and essential cofactor required by many branches of life, including most eukaryotic phytoplankton. Algae and other cobalamin auxotrophs rely on environmental cobalamin supplied from a relatively small set of cobalamin-producing prokaryotic taxa. Although several Bacteria have been implicated in cobalamin biosynthesis and associated with algal symbiosis, the involvement of Archaea in cobalamin production is poorly understood, especially with respect to the Thaumarchaeota. Based on the detection of cobalamin synthesis genes in available thaumarchaeotal genomes, we hypothesized that Thaumarchaeota, which are ubiquitous and abundant in aquatic environments, have an important role in cobalamin biosynthesis within global aquatic ecosystems. To test this hypothesis, we examined cobalamin synthesis genes across sequenced thaumarchaeotal genomes and 430 metagenomes from a diverse range of marine, freshwater and hypersaline environments. Our analysis demonstrates that all available thaumarchaeotal genomes possess cobalamin synthesis genes, predominantly from the anaerobic pathway, suggesting widespread genetic capacity for cobalamin synthesis. Furthermore, although bacterial cobalamin genes dominated most surface marine metagenomes, thaumarchaeotal cobalamin genes dominated metagenomes from polar marine environments, increased with depth in marine water columns, and displayed seasonality, with increased winter abundance observed in time-series datasets (e.g., L4 surface water in the English Channel). Our results also suggest niche partitioning between thaumarchaeotal and cyanobacterial ribosomal and cobalamin synthesis genes across all metagenomic datasets analyzed. These results provide strong evidence for specific biogeographical distributions of thaumarchaeotal cobalamin genes, expanding our understanding of the global biogeochemical roles played by Thaumarchaeota in aquatic environments.
Background: Metagenomes provide access to the taxonomic composition and functional capabilities of microbial communities. Although metagenomic analysis methods exist for estimating overall community composition or metabolic potential, identifying specific taxa that encode specific functions or pathways of interest can be more challenging. Here we present MetAnnotate, which addresses the common question: “which organisms perform my function of interest within my metagenome(s) of interest?” MetAnnotate uses profile hidden Markov models to analyze shotgun metagenomes for genes and pathways of interest, classifies retrieved sequences either through a phylogenetic placement or best-hit approach, and enables comparison of these profiles between metagenomes. Results: Based on a simulated metagenome dataset, the tool achieves high taxonomic classification accuracy for a broad range of genes, including both markers of community abundance and specific biological pathways. Lastly, we demonstrate MetAnnotate by searching for cobalamin (vitamin B12) synthesis genes across hundreds of aquatic metagenomes in a fraction of the time required by the commonly used Basic Local Alignment Search Tool top hit approach. Conclusions: MetAnnotate is multi-threaded and installable as a local web application or command-line tool on Linux systems. MetAnnotate is a useful framework for general and/or function-specific taxonomic profiling and comparison of metagenomes. Electronic supplementary material: The online version of this article (doi:10.1186/s12915-015-0195-4) contains supplementary material, which is available to authorized users.
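The best-hit classification and between-metagenome comparison described above can be sketched in a few lines. This is not MetAnnotate's code or interface: the input format (read ID, taxon, score), the function names, the example hits, and the choice of Bray-Curtis dissimilarity as the comparison measure are all assumptions made for illustration, and the profile HMM search that would produce the hits is not shown.

```python
from collections import Counter


def best_hit_profile(hits):
    """Build a taxonomic profile from search hits using a best-hit rule.

    hits: iterable of (read_id, taxon, score) tuples (hypothetical format).
    Each read is assigned to the taxon of its highest-scoring hit, and the
    result maps each taxon to its fraction of classified reads.
    """
    best = {}
    for read_id, taxon, score in hits:
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (taxon, score)
    counts = Counter(taxon for taxon, _ in best.values())
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two profiles (0 = identical, 1 = disjoint)."""
    taxa = set(profile_a) | set(profile_b)
    numerator = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    denominator = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa)
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical hits for one gene of interest in two metagenomes.
    surface = [
        ("r1", "Cyanobacteria", 120.0),
        ("r1", "Thaumarchaeota", 45.0),
        ("r2", "Cyanobacteria", 98.5),
        ("r3", "Proteobacteria", 77.0),
    ]
    deep = [
        ("r1", "Thaumarchaeota", 130.0),
        ("r2", "Thaumarchaeota", 110.0),
        ("r3", "Proteobacteria", 60.0),
    ]
    surface_profile = best_hit_profile(surface)
    deep_profile = best_hit_profile(deep)
    print(surface_profile)
    print(deep_profile)
    print("dissimilarity:", bray_curtis(surface_profile, deep_profile))
```

A best-hit rule is fast but can misassign reads when the reference database lacks close relatives, which is one reason the abstract also mentions phylogenetic placement as an alternative.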
Children who developed high myopia during 7 years of follow-up were younger and had more myopia at baseline. They also were more likely to have two myopic parents. These children may be at greater risk for sight-threatening conditions later in life.
Predicted open reading frames (ORFs) that lack detectable homology to known proteins are termed ORFans. Despite their prevalence in metagenomes, the extent to which ORFans encode real proteins, the degree to which they can be annotated, and their functional contributions, remain unclear. To gain insights into these questions, we applied sensitive remote-homology detection methods to functionally analyze ORFans from soil, marine, and human gut metagenome collections. ORFans were identified, clustered into sequence families, and annotated through profile-profile comparison to proteins of known structure. We found that a considerable number of metagenomic ORFans (73,896 of 484,121, 15.3%) exhibit significant remote homology to structurally characterized proteins, providing a means for ORFan functional profiling. The extent of detected remote homology far exceeds that obtained for artificial protein families (1.4%). As expected for real genes, the predicted functions of ORFans are significantly similar to the functions of their gene neighbors (p < 0.001). Compared to the functional profiles predicted through standard homology searches, ORFans show biologically intriguing differences. Many ORFan-enriched functions are virus-related and tend to reflect biological processes associated with extreme sequence diversity. Each environment also possesses a large number of unique ORFan families and functions, including some known to play important community roles such as gut microbial polysaccharide digestion. Lastly, ORFans are a valuable resource for finding novel enzymes of interest, as we demonstrate through the identification of hundreds of novel ORFan metalloproteases that all possess a signature catalytic motif despite a general lack of similarity to known proteins. Our ORFan functional predictions are a valuable resource for discovering novel protein families and exploring the boundaries of protein sequence space. 
All remote homology predictions are available at http://doxey.uwaterloo.ca/ORFans.