A key goal of biomedical research is to elucidate the complex network of gene interactions underlying complex traits such as common human diseases. Here we detail a multistep procedure for identifying potential key drivers of complex traits that integrates DNA-variation and gene-expression data with other complex trait data in segregating mouse populations. Ordering gene expression traits relative to one another and relative to other complex traits is achieved by systematically testing whether variations in DNA that lead to variations in relative transcript abundances statistically support an independent, causative or reactive function relative to the complex traits under consideration. We show that this approach can predict transcriptional responses to single gene-perturbation experiments using gene-expression data in the context of a segregating mouse population. We also demonstrate the utility of this approach by identifying and experimentally validating the involvement of three new genes in susceptibility to obesity.
Published in final edited form as: Nat Genet. 2005 July; 37(7): 710-717. doi:10.1038/ng1589. Author manuscript; available in PMC 2010 March 18. Correspondence should be addressed to E.E.S. (eric_schadt@merck.com). Note: Supplementary information is available on the Nature Genetics website. Competing interests statement: The authors declare that they have no competing financial interests.
In the past few years, gene-expression microarrays and other general molecular profiling technologies have been applied to a wide range of biological problems and have contributed to discoveries about the complex network of biochemical processes underlying living systems [1], common human diseases [2,3] and gene discovery and structure determination [4-6].
Microarrays have also helped to identify biomarkers [7], disease subtypes [3,8,9] and mechanisms of toxicity [10] and, more recently, to elucidate the genetics of gene expression in human populations [11,12] and to reconstruct gene networks by integrating gene-expression and genetic data [13]. The use of molecular profiling technologies as tools to identify genes underlying common, polygenic diseases has been less successful. Hundreds or even thousands of genes whose expression changes are associated with disease traits have been identified, but determining which of the genes cause disease rather than respond to the disease state has proven difficult.
Microarray data have recently been combined with other experimental approaches to facilitate identification of key mechanistic drivers of complex traits [3,13-17]. One such technique involves treating relative transcript abundances as quantitative traits in segregating populations. In this method, chromosomal regions that control the level of expression of a particular gene are mapped as expression quantitative trait loci (eQTLs). Gene-expression QTLs that contain the gene encoding t...
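The distinction between causative, reactive and independent relationships described in the abstract above can be illustrated with a small sketch. It is not the authors' procedure: the excerpt does not specify the statistical test, so the code swaps in a simple partial-correlation heuristic. Each model implies one conditional independence (causal: locus and trait are independent given expression; reactive: locus and expression are independent given trait; independent: expression and trait are independent given locus), and the sketch picks the model whose implied partial correlation is closest to zero. The function names `pearson`, `partial_corr`, `best_supported_model` and `simulate`, and all simulated data, are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def best_supported_model(locus, expr, trait):
    """Pick the model whose implied conditional independence fits best.

    Assumption: a heuristic stand-in, not the likelihood-based test
    used in the paper.
    """
    residual = {
        "causal": abs(partial_corr(locus, trait, expr)),       # L -> E -> T
        "reactive": abs(partial_corr(locus, expr, trait)),     # L -> T -> E
        "independent": abs(partial_corr(expr, trait, locus)),  # E <- L -> T
    }
    return min(residual, key=residual.get)


def simulate(model, n=2000, seed=0):
    """Simulate (locus, expression, trait) under one of the three models."""
    rng = random.Random(seed)
    locus = [rng.choice([0, 1, 2]) for _ in range(n)]  # genotype codes
    noise = lambda: rng.gauss(0.0, 1.0)
    if model == "causal":
        expr = [g + noise() for g in locus]
        trait = [e + noise() for e in expr]
    elif model == "reactive":
        trait = [g + noise() for g in locus]
        expr = [t + noise() for t in trait]
    elif model == "independent":
        expr = [g + noise() for g in locus]
        trait = [g + noise() for g in locus]
    else:
        raise ValueError(f"unknown model: {model}")
    return locus, expr, trait


for name in ("causal", "reactive", "independent"):
    print(name, "->", best_supported_model(*simulate(name)))
```

With real data, measurement noise and unmeasured confounders can make the three models hard to separate, which is one reason a formal model-selection test is preferable to this heuristic.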
Genetic variants that are associated with common human diseases do not lead directly to disease, but instead act on intermediate, molecular phenotypes that in turn induce changes in higher-order disease traits. Therefore, identifying the molecular phenotypes that vary in response to changes in DNA and that also associate with changes in disease traits has the potential to provide the functional information required to not only identify and validate the susceptibility genes that are directly affected by changes in DNA, but also to understand the molecular networks in which such genes operate and how changes in these networks lead to changes in disease traits. Toward that end, we profiled more than 39,000 transcripts and we genotyped 782,476 unique single nucleotide polymorphisms (SNPs) in more than 400 human liver samples to characterize the genetic architecture of gene expression in the human liver, a metabolically active tissue that is important in a number of common human diseases, including obesity, diabetes, and atherosclerosis. This genome-wide association study of gene expression resulted in the detection of more than 6,000 associations between SNP genotypes and liver gene expression traits, where many of the corresponding genes identified have already been implicated in a number of human diseases. The utility of these data for elucidating the causes of common human diseases is demonstrated by integrating them with genotypic and expression data from other human and mouse populations. This provides much-needed functional support for the candidate susceptibility genes being identified at a growing number of genetic loci that have been identified as key drivers of disease from genome-wide association studies of disease. 
By using an integrative genomics approach, we highlight how the gene RPS26 and not ERBB3 is supported by our data as the most likely susceptibility gene for a novel type 1 diabetes locus recently identified in a large-scale, genome-wide association study. We also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease and plasma low-density lipoprotein cholesterol levels in the process.
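An association between a SNP genotype and a gene expression trait, as described in the abstract above, can be sketched as a regression of expression on allele dosage. The abstract does not state the test statistic that was used, so the code below assumes a plain additive linear model fitted by least squares. The function name `snp_expression_association` and the toy data are hypothetical.

```python
import math


def snp_expression_association(dosages, expression):
    """Regress expression on allele dosage (0, 1 or 2 copies).

    Returns (slope, intercept, r). Assumes an additive genetic model;
    this is an illustrative choice, not the study's documented method.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need two equal-length samples of size >= 3")
    mx = sum(dosages) / n
    my = sum(expression) / n
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in expression)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression has no variation")
    slope = sxy / sxx                 # expression change per allele copy
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)    # Pearson correlation
    return slope, intercept, r


# Hypothetical toy data: six samples, two per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope, intercept, r = snp_expression_association(dosages, expression)
print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A genome-wide scan repeats a test of this kind for many SNP and transcript pairs, so any real analysis would also need a multiple-testing correction, which this sketch omits.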
We report a comprehensive analysis of gene expression differences between sexes in multiple somatic tissues of 334 mice derived from an intercross between inbred mouse strains C57BL/6J and C3H/HeJ. The analysis of a large number of individuals provided the power to detect relatively small differences in expression between sexes, and the use of an intercross allowed analysis of the genetic control of sexually dimorphic gene expression. Microarray analysis of 23,574 transcripts revealed that the extent of sexual dimorphism in gene expression was much greater than previously recognized. Thus, thousands of genes showed sexual dimorphism in liver, adipose, and muscle, and hundreds of genes were sexually dimorphic in brain. These genes exhibited highly tissue-specific patterns of expression and were enriched for distinct pathways represented in the Gene Ontology database. They also showed evidence of chromosomal enrichment, not only on the sex chromosomes, but also on several autosomes. Genetic analyses provided evidence of the global regulation of subsets of the sexually dimorphic genes, as the transcript levels of a large number of these genes were controlled by several expression quantitative trait loci (eQTL) hotspots that exhibited tissue-specific control. Moreover, many tissue-specific transcription factor binding sites were found to be enriched in the sexually dimorphic genes.
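Detecting a sex difference in the expression of a single transcript, as described in the abstract above, can be sketched with a two-sample comparison. The abstract does not name the test that was applied, so the code below assumes Welch's t statistic as a stand-in. The function name `welch_t` and the toy values are hypothetical.

```python
import math
import statistics


def welch_t(male, female):
    """Welch's t statistic for the male-minus-female mean difference.

    Assumption: an illustrative test choice; the source abstract does
    not say which statistic the authors used.
    """
    if len(male) < 2 or len(female) < 2:
        raise ValueError("each group needs at least two samples")
    diff = statistics.mean(male) - statistics.mean(female)
    # Sample variances (n - 1 denominator), not pooled across groups.
    se = math.sqrt(
        statistics.variance(male) / len(male)
        + statistics.variance(female) / len(female)
    )
    return diff / se


# Hypothetical expression values for one transcript.
male = [4.0, 5.0, 6.0]
female = [1.0, 2.0, 3.0]
print(f"t = {welch_t(male, female):.4f}")
```

Large sample sizes shrink the standard error in the denominator, which is why a cohort of several hundred animals can detect small expression differences between sexes.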
Identifying variations in DNA that increase susceptibility to disease is one of the primary aims of genetic studies using a forward genetics approach. However, identification of disease-susceptibility genes by means of such studies provides limited functional information on how genes lead to disease. In fact, in most cases there is an absence of functional information altogether, preventing a definitive identification of the susceptibility gene or genes. Here we develop an alternative to the classic forward genetics approach for dissecting complex disease traits where, instead of identifying susceptibility genes directly affected by variations in DNA, we identify gene networks that are perturbed by susceptibility loci and that in turn lead to disease. Application of this method to liver and adipose gene expression data generated from a segregating mouse population results in the identification of a macrophage-enriched network supported as having a causal relationship with disease traits associated with metabolic syndrome. Three genes in this network, lipoprotein lipase (Lpl), lactamase beta (Lactb) and protein phosphatase 1-like (Ppm1l), are validated as previously unknown obesity genes, strengthening the association between this network and metabolic disease traits. Our analysis provides direct experimental support that complex traits such as obesity are emergent properties of molecular networks that are modulated by complex genetic loci and environmental factors.
Proteins circulating in the blood are critical for age-related disease processes; however, the serum proteome has remained largely unexplored. To this end, 4137 proteins covering most predicted extracellular proteins were measured in the serum of 5457 Icelanders over 65 years of age. Pairwise correlation between proteins as they varied across individuals revealed 27 different network modules of serum proteins, many of which were associated with cardiovascular and metabolic disease states, as well as overall survival. The protein modules were controlled by cis- and trans-acting genetic variants, which in many cases were also associated with complex disease. This revealed co-regulated groups of circulating proteins that incorporated regulatory control between tissues and demonstrated close relationships to past, current, and future disease states.
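Grouping proteins into modules from their pairwise correlations, as described in the abstract above, can be sketched as follows. The abstract does not specify the module-detection algorithm, so the code assumes a deliberately simple stand-in: link two proteins when the absolute Pearson correlation of their levels across individuals meets a threshold, then report connected components as modules. The function names, the threshold of 0.8 and the protein labels are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_modules(profiles, threshold=0.8):
    """Group proteins into modules via thresholded correlation.

    profiles maps a protein name to its levels across individuals.
    Assumption: thresholding plus connected components is a simplified
    stand-in for whatever clustering the study applied.
    """
    names = sorted(profiles)
    neighbours = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    modules, seen = [], set()
    for name in names:
        if name in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [name], []
        seen.add(name)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        modules.append(sorted(component))
    return modules


# Hypothetical serum levels for four proteins in five individuals.
toy = {
    "P1": [1, 2, 3, 4, 5],
    "P2": [2, 4, 6, 8, 10.5],
    "P3": [5, 1, 4, 2, 3],
    "P4": [10, 2, 8, 4, 6.5],
}
print(correlation_modules(toy))
```

Proteins with no correlation above the threshold come back as single-member modules, so a real analysis would likely filter those out or use a clustering method that tolerates weaker links.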