2021
DOI: 10.1186/s13059-021-02479-9

PUMAS: fine-tuning polygenic risk scores with GWAS summary statistics

Abstract: Polygenic risk scores (PRSs) have wide applications in human genetics research, but often include tuning parameters which are difficult to optimize in practice due to limited access to individual-level data. Here, we introduce PUMAS, a novel method to fine-tune PRS models using summary statistics from genome-wide association studies (GWASs). Through extensive simulations, external validations, and analysis of 65 traits, we demonstrate that PUMAS can perform various model-tuning procedures using GWAS summary st…

Cited by 36 publications (39 citation statements)
References 56 publications
“…Finally, we introduce an innovative strategy to linearly combine multiple PRS trained in different populations using summary association data alone. We employ a summary statistics-based repeated learning approach motivated from our recent work [45] to estimate the regression weights for combining multiple PRS. The entire X-Wing procedure only requires GWAS summary data and LD references as input, which is a major advance compared to existing approaches.…”
Section: Results (mentioning)
confidence: 99%
“…Typically, repeated learning (or cross-validation) requires individual-level genotype and phenotype data since it involves sample splitting. Generalizing the technique in our recent work [45], we introduce a summary statistics-based repeated learning strategy, which does not need individual-level GWAS data. This approach has three main steps which we describe below.…”
Section: Methods (mentioning)
confidence: 99%
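To make the idea of sample splitting without individual-level data more concrete, here is a minimal sketch under simplifying assumptions. It is not the three-step procedure from the cited work, which the excerpt does not spell out. For a single SNP, let S be the sum over all N individuals of the products x_i * y_i. If n of those individuals are drawn at random without replacement as a training subset, the training sum has mean n*S/N and variance n*(N-n)/(N-1) times the variance of the products. The function name `split_summary_stat` and all simulated data are hypothetical, and the variance of the products is computed here from simulated individual-level data only to check the approximation; in a summary-statistics setting it would have to be estimated.

```python
import numpy as np

def split_summary_stat(total_sum, prod_var, n_total, n_train, rng):
    """Draw a training-subset sum and the complementary validation sum.

    Uses a normal approximation to the sum of n_train products x_i * y_i
    sampled without replacement from n_total; prod_var is the variance of
    those products (an input assumed to be known or estimated).
    """
    mean = n_train / n_total * total_sum
    var = n_train * (n_total - n_train) / (n_total - 1) * prod_var
    train_sum = rng.normal(mean, np.sqrt(var))
    return train_sum, total_sum - train_sum

rng = np.random.default_rng(0)
n_total, n_train = 10_000, 7_500

# Simulated individual-level data for one SNP (used only for the check).
x = rng.binomial(2, 0.3, size=n_total).astype(float)
x -= x.mean()
y = 0.05 * x + rng.normal(size=n_total)
products = x * y
total_sum = products.sum()
prod_var = products.var()

# Training sums from real random splits of the individuals ...
true_sums = np.array([
    products[rng.permutation(n_total)[:n_train]].sum() for _ in range(2000)
])
# ... versus training sums drawn from the summary-level approximation.
sim_sums = np.array([
    split_summary_stat(total_sum, prod_var, n_total, n_train, rng)[0]
    for _ in range(2000)
])

print("mean of real splits / simulated:", true_sums.mean(), sim_sums.mean())
print("sd of real splits / simulated:  ", true_sums.std(), sim_sums.std())
```

The two sets of training sums should have closely matching means and standard deviations, which is what allows repeated "splits" to be generated from a single full-sample statistic in this toy setting. Linkage disequilibrium between SNPs is ignored here.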
“…The method requires three independent datasets: (1) GWAS summary statistics from training datasets across EUR and non-EUR populations; (2) a tuning dataset for the target population to find optimal model parameters; and (3) a validation dataset for the target population to report the final prediction performance. While this report assumes that individual-level data are available for model tuning and validation, summary-statistics-based methods [38,39] could also be used in these steps.…”
Section: Results (mentioning)
confidence: 99%
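The three-dataset workflow in the statement above can be sketched as follows, assuming individual-level tuning and validation data are available. This is an illustrative toy, not the cited method: a single simulated population, independent standardized genotypes, a simple |z|-threshold PRS, and approximate z-scores. All names (`simulate`, `weights`, `thresholds`) and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_snps = 200
beta = np.zeros(n_snps)
beta[:20] = rng.normal(0, 0.15, size=20)  # 20 causal SNPs (simulated)

def simulate(n):
    """Simulate independent standardized genotypes and a phenotype."""
    g = rng.normal(size=(n, n_snps))
    return g, g @ beta + rng.normal(size=n)

def r2(score, pheno):
    """Squared Pearson correlation; 0.0 if the score is constant."""
    if score.std() == 0:
        return 0.0
    return np.corrcoef(score, pheno)[0, 1] ** 2

# (1) Training data, reduced to per-SNP summary statistics.
g_train, y_train = simulate(5000)
n_train = len(y_train)
beta_hat = g_train.T @ y_train / n_train           # marginal effect estimates
z = beta_hat * np.sqrt(n_train) / y_train.std()    # approximate z-scores

def weights(threshold):
    """PRS weights: keep SNPs whose |z| meets the threshold."""
    return np.where(np.abs(z) >= threshold, beta_hat, 0.0)

# (2) Tuning data: pick the threshold with the best predictive R^2.
g_tune, y_tune = simulate(1000)
thresholds = [0.5, 1.0, 2.0, 3.0, 4.0]
tune_r2 = {t: r2(g_tune @ weights(t), y_tune) for t in thresholds}
best = max(tune_r2, key=tune_r2.get)

# (3) Validation data: report performance of the selected model only.
g_val, y_val = simulate(1000)
val_r2 = r2(g_val @ weights(best), y_val)

print("tuning R^2 by threshold:", tune_r2)
print("selected threshold:", best, "validation R^2:", val_r2)
```

Keeping the validation set separate from the tuning set avoids reporting an R^2 that is inflated by the threshold selection itself.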
“…Methodological challenges in computing PRS reside in estimating the highly polygenic yet typically weak SNP effects for most complex traits and accounting for extensive LD in the human genome. Recently, penalized regression models that re-estimate SNP effects from GWAS summary statistics while explicitly modeling LD have been shown to effectively improve the predictive performance of PRS [101][102][103], and novel resampling approaches now allow model fine-tuning without individual-level genotype and phenotype data [104]. Additionally, Khera et al. convincingly demonstrated that individuals with very high PRS show substantially elevated coronary artery disease risk that is comparable to having monogenic mutations with large effects [105].…”
Section: Disease Risk Prediction (mentioning)
confidence: 99%