2021
DOI: 10.1101/2021.11.18.21266545
Preprint

Global biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts

Abstract: With the increasing availability of biobank-scale datasets that incorporate both genomic data and electronic health records, many associations between genetic variants and phenotypes of interest have been discovered. Polygenic risk scores (PRS), which are being widely explored in precision medicine, use the results of association studies to predict the genetic component of disease risk by accumulating risk alleles weighted by their effect sizes. However, limited studies have thoroughly investigated best…

Cited by 27 publications (62 citation statements)
References 75 publications
“…We design GWAS simulations where variants have different sample sizes, which is often the case when meta-analyzing GWAS summary statistics from multiple cohorts with different sets of variants (Wang et al., 2021). Using 40,000 variants from chromosome 22 (Methods), we simulate quantitative phenotypes with a heritability of 20% and 2000 causal variants.…”
Section: Results (mentioning)
confidence: 99%
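
The simulation setup quoted above (a quantitative phenotype with a fixed heritability and a fixed number of causal variants) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the genotypes are independent binomial draws rather than real chromosome 22 data (so there is no LD), the dimensions are scaled down, the function name `simulate_phenotype` is hypothetical, and the per-variant sample-size variation mentioned in the quote is not modelled. It is not the citing paper's procedure.

```python
import numpy as np

def simulate_phenotype(genotypes, n_causal, h2, rng):
    """Simulate a quantitative phenotype whose genetic component has variance h2.

    Illustrative only: causal effects are drawn from a standard normal,
    then the genetic and noise components are rescaled to h2 and 1 - h2.
    """
    n_ind, n_var = genotypes.shape
    # Pick the causal variants without replacement.
    causal = rng.choice(n_var, size=n_causal, replace=False)
    beta = np.zeros(n_var)
    beta[causal] = rng.normal(size=n_causal)
    # Genetic component, rescaled to have variance exactly h2.
    g = genotypes @ beta
    g = (g - g.mean()) / g.std() * np.sqrt(h2)
    # Environmental noise, rescaled to have variance exactly 1 - h2.
    e = rng.normal(size=n_ind)
    e = (e - e.mean()) / e.std() * np.sqrt(1 - h2)
    return g + e, g, causal

rng = np.random.default_rng(0)
# Toy stand-in for real genotypes: 500 individuals, 4,000 variants (assumed sizes).
maf = rng.uniform(0.05, 0.5, size=4000)
G = rng.binomial(2, maf, size=(500, 4000)).astype(float)
# Same proportions as the quoted setup (5% causal variants, heritability 20%).
y, g, causal = simulate_phenotype(G, n_causal=200, h2=0.2, rng=rng)
```

With real genotype data, `G` would be replaced by the actual genotype matrix and the sizes set to the quoted 40,000 variants and 2000 causal variants.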
“…Note that LD blocks are already widely used by several methods (such as lassosum and PRS-CS) because they allow for processing smaller matrices at once (Mak et al., 2017; Ge et al., 2019); here we have shown that these blocks are also useful to make methods more robust. PRS-CS is currently one of the most robust PGS methods; for example, it can use the (small) 1000 Genomes data as LD reference (Ge et al., 2019), and can even use a European LD reference panel with multi-ancestry GWAS summary statistics (Wang et al., 2021). We think this is made possible by the use of a (very) strong regularization in PRS-CS (which would approximately correspond to using s = 0.5 in lassosum, δ = 1 in lassosum2, and 0.5 in LDpred2-auto).…”
Section: Discussion (mentioning)
confidence: 99%
“…The ischemic stroke PRS did not significantly predict stroke in the HUNT dataset when added on top of the age and sex information. Additionally, in an accompanying paper by Wang et al, the performance of the GBMI derived PRS was compared to one derived using the MegaStroke summary statistics (Wang et al, 2021). Wang et al concluded that the previous meta-analysis, with more cases underlying the summary statistics, performed better for African ancestry individuals whereas the GBMI derived PRS was slightly better for European individuals.…”
Section: Discussion (mentioning)
confidence: 99%
“…Because of the diverse ancestry in the GWASs for meta-analysis, there was remarkable heterogeneity in the effective sample sizes across the genome-wide variants of the GBMI results of each phenotype. This heterogeneity affected the performance of downstream analyses, including those of polygenic risk score [65]. Therefore, we excluded variants with effective sample sizes <50% of the maximum effective sample size from the GWAS summary statistics of each phenotype.…”
Section: Methods (mentioning)
confidence: 99%
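
The variant filter described in the statement above (drop variants whose effective sample size is below 50% of the maximum) might look roughly like the following sketch. The function name `filter_by_effective_n`, the `min_fraction` parameter, and the example values are hypothetical; how the effective sample size itself is computed is not stated in the quote and is taken as given here.

```python
import numpy as np

def filter_by_effective_n(n_eff, min_fraction=0.5):
    """Return a boolean mask of variants to keep.

    A variant is kept when its effective sample size is at least
    min_fraction of the largest effective sample size observed.
    """
    n_eff = np.asarray(n_eff, dtype=float)
    return n_eff >= min_fraction * n_eff.max()

# Made-up per-variant effective sample sizes for one phenotype.
n_eff = np.array([1000, 400, 500, 950, 120])
keep = filter_by_effective_n(n_eff)
# Variants at exactly 50% of the maximum are kept, since only those
# strictly below the threshold are excluded.
filtered = n_eff[keep]
```

The same mask can be applied to every column of a summary-statistics table so that effect sizes, standard errors, and variant identifiers stay aligned.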