2016
DOI: 10.1101/058214
Preprint

Polygenic scores via penalized regression on summary statistics

Abstract: Polygenic scores (PGS) summarize the genetic contribution of a person's genotype to a disease or phenotype. They can be used to group participants into different risk categories for diseases, and are also used as covariates in epidemiological analyses. A number of possible ways of calculating polygenic scores have been proposed, and recently there has been much interest in methods that incorporate information available in published summary statistics. As there is no inherent information on linkage disequilibrium (LD)…

Cited by 2 publications (3 citation statements)
References 65 publications
“…Previous studies on the lack of transferability of PGS have generally estimated scores using summary statistics from genome-wide association studies of single-ancestry populations [26, 25, 12]. These summary statistic approaches are often highly efficient computationally and typically achieve highly competitive predictive performance relative to full genotype approaches [21, 23, 43, 35]. However, combining such data raises complexities, including assumptions that have to be made about the correlation structure of untyped variants and the comparability of phenotype definitions.…”
Section: Discussion (mentioning)
confidence: 99%
“…Sample sizes are often in the thousands to the hundreds of thousands, while the number of variants can be in the millions, making tools from classical statistics like multiple regression impossible to apply. In principle, Bayesian methods or regularization methods such as the LASSO [31, 57] or ridge regression [24, 59] can make the original ill-posed problem well-posed. Yet, without a solid understanding of the distribution of effect sizes, choosing the form and amount of regularization can be difficult.…”
Section: Introduction (mentioning)
confidence: 99%
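
The point in the statement above, that regularization makes an ill-posed problem well-posed, can be illustrated with a minimal sketch. It assumes simulated Gaussian data rather than real genotypes, and the function name `ridge_fit`, the dimensions, and the penalty `lam` are hypothetical choices made for illustration only. This is plain ridge regression on individual-level data, not the summary-statistics method of the preprint.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge estimate via the dual form: beta = X^T (X X^T + lam I_n)^{-1} y.

    Algebraically equal to (X^T X + lam I_p)^{-1} X^T y, but it only
    requires solving an n x n system, which is cheaper when p >> n.
    """
    n = X.shape[0]
    K = X @ X.T + lam * np.eye(n)
    return X.T @ np.linalg.solve(K, y)


# Simulated data (an assumption): far more variants (p) than samples (n).
rng = np.random.default_rng(0)
n, p = 50, 500
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# rank(X^T X) = rank(X) <= n < p, so X^T X is singular and ordinary
# least squares has no unique solution.
rank = np.linalg.matrix_rank(X)

# Adding lam * I makes the system invertible, so the estimate is unique.
lam = 1.0
beta_hat = ridge_fit(X, y, lam)
```

The amount of shrinkage is controlled entirely by `lam`, which is the tuning difficulty the quoted passage refers to: in practice it would be chosen by cross-validation or on a held-out validation set.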
“…A related line of work predicts phenotype from genotype using so-called “polygenic scores” (PGSs) or “polygenic risk scores”. State-of-the-art approaches use some form of explicit regularization like the LASSO [31], or perform Bayesian inference, where an assumed distribution of effect sizes is used as a prior and acts as a regularizer. These methods typically specify a particular family of priors such as a Normal with a point mass at zero [59], a mixture of a small number of Normals [30], or a particular scale-mixture of Normals [22], and the user is required to choose a distribution from this family by tuning a hyperparameter using a held-out validation dataset.…”
Section: Introduction (mentioning)
confidence: 99%