For most human complex diseases and traits, SNPs identified by genome-wide association studies (GWAS) explain only a small fraction of the heritability. Here we report a user-friendly software tool called genome-wide complex trait analysis (GCTA), which implements a method we recently developed to address the "missing heritability" problem. GCTA estimates the variance explained by all the SNPs on a chromosome or on the whole genome for a complex trait rather than testing the association of any particular SNP with the trait. We introduce GCTA's five main functions: data management, estimation of the genetic relationships from SNPs, mixed linear model analysis of variance explained by the SNPs, estimation of the linkage disequilibrium structure, and GWAS simulation. We focus on the function of estimating the variance explained by all the SNPs on the X chromosome and testing the hypotheses of dosage compensation. The GCTA software is a versatile tool to estimate and partition complex trait variation with large GWAS data sets.
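To make "estimation of the genetic relationships from SNPs" concrete, the sketch below computes a genetic relationship matrix from allele counts by standardising each SNP with its sample allele frequency and averaging the cross-products over SNPs. It is an illustration under stated assumptions, not GCTA's own code: the function name and input layout are hypothetical, genotypes are assumed complete (no missing calls), and diagonal elements are computed with the same formula as off-diagonals, which the published method may treat differently.

```python
def genetic_relationship_matrix(genotypes):
    """Return an n x n relationship matrix from SNP allele counts.

    genotypes: list of n individuals, each a list of m allele counts (0, 1 or 2).
    Hypothetical helper for illustration; assumes no missing genotypes.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Allele frequency of each SNP, estimated from the sample itself.
    freqs = [sum(ind[i] for ind in genotypes) / (2 * n) for i in range(m)]

    # Monomorphic SNPs have zero variance and cannot be standardised.
    used = [i for i in range(m) if 0 < freqs[i] < 1]
    if not used:
        raise ValueError("no polymorphic SNPs to estimate relationships from")

    # Standardise: (x - 2p) / sqrt(2p(1 - p)) for each individual and SNP.
    z = [
        [(ind[i] - 2 * freqs[i]) / (2 * freqs[i] * (1 - freqs[i])) ** 0.5
         for i in used]
        for ind in genotypes
    ]

    # Relationship between j and k: mean cross-product over the SNPs used.
    return [
        [sum(a * b for a, b in zip(z[j], z[k])) / len(used) for k in range(n)]
        for j in range(n)
    ]


if __name__ == "__main__":
    toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
    for row in genetic_relationship_matrix(toy):
        print([round(v, 3) for v in row])
```

Because frequencies are estimated from the same sample, each standardised SNP column sums to zero, so every row of the resulting matrix also sums to zero. A matrix of this kind is what the mixed linear model then uses as the covariance structure for the SNP effects.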
Schizophrenia is a heritable disorder with substantial public health
impact. We conducted a multi-stage genome-wide association study (GWAS) for
schizophrenia beginning with a Swedish national sample (5,001 cases, 6,243
controls) followed by meta-analysis with prior schizophrenia GWAS (8,832 cases,
12,067 controls) and finally by replication of SNPs in 168 genomic regions in
independent samples (7,413 cases, 19,762 controls, and 581 trios). In total, 22
regions met genome-wide significance (14 novel and one previously implicated in
bipolar disorder). The results strongly implicate calcium signaling in the
etiology of schizophrenia, and include genome-wide significant results for
CACNA1C and CACNB2, whose protein products interact. We estimate that
∼8,300 independent and predominantly common
SNPs contribute to risk for schizophrenia and that these collectively account
for most of its heritability. Common genetic variation plays an important role
in the etiology of schizophrenia, and larger studies will allow more detailed
understanding of this devastating disorder.
Genome-wide association studies are designed to discover SNPs that are associated with a complex trait. Employing strict significance thresholds when testing individual SNPs avoids false positives at the expense of increasing false negatives. Recently, we developed a method for quantitative traits that estimates the variation accounted for when fitting all SNPs simultaneously. Here we develop this method further for case-control studies. We use a linear mixed model for analysis of binary traits and transform the estimates to a liability scale by adjusting both for scale and for ascertainment of the case samples. We show by theory and simulation that the method is unbiased. We apply the method to data from the Wellcome Trust Case Control Consortium and show that a substantial proportion of variation in liability for Crohn disease, bipolar disorder, and type I diabetes is tagged by common SNPs.
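The transformation to the liability scale can be sketched numerically. The snippet below uses the commonly cited form of the correction, in which the observed-scale estimate is multiplied by K²(1−K)² / (z² P(1−P)), where K is the population prevalence, P is the proportion of cases in the sample, and z is the standard normal density at the liability threshold. The abstract does not state the formula, so its exact form, the function name, and the parameter names are assumptions made for illustration.

```python
from statistics import NormalDist


def observed_to_liability(h2_observed, prevalence, case_fraction):
    """Convert an observed-scale (0/1) variance estimate to the liability scale.

    h2_observed   : estimate from the linear mixed model on the binary trait
    prevalence    : K, population prevalence of the disease
    case_fraction : P, proportion of cases in the (ascertained) sample
    Hypothetical helper; the formula is an assumption, not quoted from the abstract.
    """
    if not (0 < prevalence < 1 and 0 < case_fraction < 1):
        raise ValueError("prevalence and case_fraction must lie in (0, 1)")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Liability threshold above which an individual is affected.
    threshold = std_normal.inv_cdf(1 - prevalence)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)

    scale = (prevalence * (1 - prevalence)) ** 2
    return h2_observed * scale / (z ** 2 * case_fraction * (1 - case_fraction))


if __name__ == "__main__":
    # Illustrative inputs only, not values from any study.
    print(observed_to_liability(0.2, prevalence=0.01, case_fraction=0.5))
```

When the sample is not ascertained (P equal to K), the expression reduces to the scale-only adjustment K(1−K)/z², so the second factor can be read as the ascertainment correction.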
We propose a method (GREML-LDMS) to estimate heritability for human complex traits in unrelated individuals using whole-genome sequencing (WGS) data. We demonstrate using simulations based on WGS data that ~97% and ~68% of variation at common and rare variants, respectively, can be captured by imputation. Using the GREML-LDMS method, we estimate from 44,126 unrelated individuals that all ~17M imputed variants explain 56% (s.e. = 2.3%) of variance for height and 27% (s.e. = 2.5%) for body mass index (BMI), and find evidence that height- and BMI-associated variants have been under natural selection. Considering imperfect tagging of imputation and potential overestimation of heritability from previous family-based studies, heritability is likely to be 60–70% for height and 30–40% for BMI. Therefore, missing heritability is small for both traits. For further gene discovery of complex traits, a design with SNP arrays followed by imputation is more cost-effective than WGS at current prices.