Evaluation of methods incorporating biological function and GWAS summary statistics to accelerate discovery

Moore, Amy; Marks, Jesse; Quach, Bryan C.; Guo, Yuelong; Bierut, Laura J.; Gaddis, Nathan; Hancock, Dana B.; Page, Grier P.; Johnson, Eric O.

doi:10.1101/2022.01.10.475153

Cited by 3 publications

(9 citation statements)

References 91 publications


“…While machine learning out-performed individual methods and even linear combinations of methods, it only correctly identified an additional 1-8 genes, and only for highly heritable traits with a well-powered smaller GWAS (SCZ and ICV). This observation converges with related work evaluating methods for SNP and locus annotation (13,14,48), which has concluded that such methods can only marginally increase the number of true-positive observations. Similarly, prior work examining positional gene-based methods in a simulated trait with relatively few causal SNPs (n=602) observed a tradeoff between sensitivity and specificity (49), which is also seen here.…”

Section: Discussion (supporting)

confidence: 88%

“…The gene-level focus reflects an additional weakness of the omics-integration approach, in that it could lead to improved knowledge of biology without narrowing down the identity of causal loci in human populations. Thus, while recent related studies using locus-level analyses yielded similar findings (14), we cannot rule out that alternative methods could have led to stronger results. We note some additional limitations of the present study.…”

Section: Discussion (mentioning)

confidence: 59%

“…Analyses focused on trait associations at the gene level, rather than individual SNPs (14). This enabled the additional integration of multi-omics data that did not use GWAS information, including results from post mortem studies of gene expression in patients and rodent models, and a gene-set for AUD/PAU that integrates information from a variety of rodent data sources (12).…”

Section: Discussion (mentioning)

confidence: 99%

“…As greater evidence arises for genome-wide significant signals to be enriched in regulatory regions, these multi-omics data have proved to be valuable in gene prioritization. There has also been speculation that leveraging these ever-increasing omics sources might "recover" true signal from smaller GWAS, i.e., signals that may not meet criteria for genome-wide significance but are supported by multi-omics data, and eventually are identified as GWAS become larger, thus serving as a substitute for additional sample size (13-15).…”

Section: Introduction (mentioning)

confidence: 99%


Multi-omics analyses cannot identify true-positive novel associations from underpowered genome-wide association studies of four brain-related traits

Baranger, Hatoum, Polimanti, et al. 2022

Preprint


Background: The integration of multi-omics information (e.g., epigenetics and transcriptomics) can be useful for interpreting findings from genome-wide association studies (GWAS). It has additionally been suggested that multi-omics may aid in novel variant discovery, thus circumventing the need to increase GWAS sample sizes. We tested whether incorporating multi-omics information in earlier and smaller sized GWAS boosts true-positive discovery of genes that were later revealed by larger GWAS of the same/similar traits. Methods: We applied ten different analytic approaches to integrating multi-omics data from twelve sources (e.g., Genotype-Tissue Expression project) to test whether earlier and smaller GWAS of 4 brain-related traits (i.e., alcohol use disorder/problematic alcohol use [AUD/PAU], major depression [MDD], schizophrenia [SCZ], and intracranial volume [ICV]) could detect genes that were revealed by a later and larger GWAS. Results: Multi-omics data did not reliably identify novel genes in earlier less powered GWAS (PPV<0.2; 80% false-positive associations). Machine learning predictions marginally increased the number of identified novel genes, correctly identifying 1-8 additional genes, but only for well-powered early GWAS of highly heritable traits (i.e., ICV and SCZ). Multi-omics, particularly positional mapping (i.e., fastBAT, MAGMA, and H-MAGMA), was useful for prioritizing genes within genome-wide significant loci (PPVs = 0.5-1.0). Conclusions: Although the integration of multi-omics information, particularly when multiple methods agree, helps prioritize GWAS findings and translate them into information about disease biology, it does not substantively increase novel gene discovery in brain-related GWAS. To increase power for discovery of novel genes and loci, increasing sample size is a requirement.
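The positive predictive value (PPV) reported in the abstract above is the share of predicted genes that a later, larger GWAS confirmed. A minimal sketch of that arithmetic follows; the function name and the gene labels are hypothetical and chosen only so the numbers are easy to follow. They are not taken from the study.

```python
def positive_predictive_value(predicted, confirmed):
    """PPV = true positives / all positive predictions."""
    predicted = set(predicted)
    if not predicted:
        raise ValueError("no predicted genes; PPV is undefined")
    true_positives = predicted & set(confirmed)
    return len(true_positives) / len(predicted)


# Hypothetical gene labels, for illustration only.
early_predictions = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
later_gwas_hits = {"GENE_A", "GENE_X", "GENE_Y"}

ppv = positive_predictive_value(early_predictions, later_gwas_hits)
print(f"PPV = {ppv:.2f}, false-positive share = {1 - ppv:.2f}")
```

In this toy case one of five predictions is confirmed, so the PPV is 0.2 and four of five predictions are false positives.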


“…[15,16] Some have speculated that multi-omics could serve as a substitute for additional sample size, and "recover" true signal from smaller GWAS by increasing support for signals that do not meet criteria for genome-wide significance. [17-19] This is an appealing proposition, because it would reduce the need (and expense) for collection and phenotyping of larger samples. Some early work using expression-based methods (e.g., transcriptome-wide association analyses, TWAS) found that they can identify genes that would eventually achieve genome-wide significance in subsequent, larger GWAS, [20,21] although these studies did not evaluate what proportion of all novel genes were first identified by TWAS.…”

Section: Introduction (mentioning)

confidence: 99%

Multi‐omics cannot replace sample size in genome‐wide association studies

Baranger, Hatoum, Polimanti, et al. 2023

Genes Brain and Behavior


The integration of multi‐omics information (e.g., epigenetics and transcriptomics) can be useful for interpreting findings from genome‐wide association studies (GWAS). It has been suggested that multi‐omics could circumvent or greatly reduce the need to increase GWAS sample sizes for novel variant discovery. We tested whether incorporating multi‐omics information in earlier and smaller‐sized GWAS boosts true‐positive discovery of genes that were later revealed by larger GWAS of the same/similar traits. We applied 10 different analytic approaches to integrating multi‐omics data from 12 sources (e.g., Genotype‐Tissue Expression project) to test whether earlier and smaller GWAS of 4 brain‐related traits (alcohol use disorder/problematic alcohol use, major depression/depression, schizophrenia, and intracranial volume/brain volume) could detect genes that were revealed by a later and larger GWAS. Multi‐omics data did not reliably identify novel genes in earlier less‐powered GWAS (PPV <0.2; 80% false‐positive associations). Machine learning predictions marginally increased the number of identified novel genes, correctly identifying 1–8 additional genes, but only for well‐powered early GWAS of highly heritable traits (i.e., intracranial volume and schizophrenia). Although multi‐omics, particularly positional mapping (i.e., fastBAT, MAGMA, and H‐MAGMA), can help to prioritize genes within genome‐wide significant loci (PPVs = 0.5–1.0) and translate them into information about disease biology, it does not reliably increase novel gene discovery in brain‐related GWAS. To increase power for discovery of novel genes and loci, increasing sample size is required.


Integrating eQTL and GWAS data characterises established and identifies novel migraine risk loci

Ghaffar, Nyholt 2023

Hum. Genet.


Migraine—a painful, throbbing headache disorder—is the most common complex brain disorder, yet its molecular mechanisms remain unclear. Genome-wide association studies (GWAS) have proven successful in identifying migraine risk loci; however, much work remains to identify the causal variants and genes. In this paper, we compared three transcriptome-wide association study (TWAS) imputation models—MASHR, elastic net, and SMultiXcan—to characterise established genome-wide significant (GWS) migraine GWAS risk loci, and to identify putative novel migraine risk gene loci. We compared the standard TWAS approach of analysing 49 GTEx tissues with Bonferroni correction for testing all genes present across all tissues (Bonferroni), to TWAS in five tissues estimated to be relevant to migraine, and TWAS with Bonferroni correction that took into account the correlation between eQTLs within each tissue (Bonferroni-matSpD). Elastic net models performed in all 49 GTEx tissues using Bonferroni-matSpD characterised the highest number of established migraine GWAS risk loci (n = 20) with GWS TWAS genes having colocalisation (PP4 > 0.5) with an eQTL. SMultiXcan in all 49 GTEx tissues identified the highest number of putative novel migraine risk genes (n = 28) with GWS differential expression at 20 non-GWS GWAS loci. Nine of these putative novel migraine risk genes were later found to be at and in linkage disequilibrium with true (GWS) migraine risk loci in a recent, more powerful migraine GWAS. Across all TWAS approaches, a total of 62 putative novel migraine risk genes were identified at 32 independent genomic loci. Of these 32 loci, 21 were true risk loci in the recent, more powerful migraine GWAS. Our results provide important guidance on the selection, use, and utility of imputation-based TWAS approaches to characterise established GWAS risk loci and identify novel risk gene loci.
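The abstract above contrasts a plain Bonferroni correction with one that accounts for correlation between tests. The sketch below illustrates the general idea using one commonly cited estimator of the effective number of independent tests, Meff = 1 + (M - 1)(1 - Var(λ)/M), where λ are the eigenvalues of the M x M correlation matrix. Whether the paper used exactly this estimator is an assumption, since the abstract does not say; the function names and the example matrices are hypothetical.

```python
def effective_number_of_tests(corr):
    """Estimate Meff from a symmetric correlation matrix with unit diagonal.

    Assumed estimator: Meff = 1 + (M - 1) * (1 - Var(eigenvalues) / M).
    For such a matrix the eigenvalues sum to M and their squares sum to
    the sum of squared entries, so the sample variance of the eigenvalues
    can be computed without an eigen-decomposition.
    """
    m = len(corr)
    if m == 1:
        return 1.0
    sum_of_squares = sum(value * value for row in corr for value in row)
    eigenvalue_variance = (sum_of_squares - m) / (m - 1)
    return 1.0 + (m - 1) * (1.0 - eigenvalue_variance / m)


def adjusted_threshold(corr, alpha=0.05):
    """Bonferroni-style threshold using Meff instead of the raw test count."""
    return alpha / effective_number_of_tests(corr)


# Hypothetical correlation matrices, for illustration only.
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
redundant = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
partly_correlated = [[1.0, 0.5], [0.5, 1.0]]

for name, matrix in [("independent", independent),
                     ("redundant", redundant),
                     ("partly correlated", partly_correlated)]:
    print(name, effective_number_of_tests(matrix), adjusted_threshold(matrix))
```

Uncorrelated tests keep the full Bonferroni penalty (Meff equals M), perfectly correlated tests count as a single test, and intermediate correlation falls in between.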
