2016
DOI: 10.1371/journal.pgen.1006288

Using Genetic Distance to Infer the Accuracy of Genomic Prediction

Abstract: The prediction of phenotypic traits using high-density genomic data has many applications such as the selection of plants and animals of commercial interest; and it is expected to play an increasing role in medical diagnostics. Statistical models used for this task are usually tested using cross-validation, which implicitly assumes that new individuals (whose phenotypes we would like to predict) originate from the same population the genomic prediction model is trained on. In this paper we propose an approach …

Cited by 128 publications (112 citation statements)
References 53 publications
“…In addition, this approach attains 30% relative improvement in prediction accuracy for height in an African cohort. These relative improvements are robust to overfitting, consistent with simulations and reduce the documented gap in risk prediction accuracy between European and non-European target populations (Bustamante, De La Vega, & Burchard, 2011; International Schizophrenia Consortium et al, 2009; Popejoy & Fullerton, 2016; Rosenberg et al, 2010; Scutari et al, 2016; Vilhjálmsson et al, 2015); we note that there are at least 35 phenotypes for which there are published GWAS data sets in Europeans and at least one non-European population (with minimum sample size of 8,000) that are listed in the NHGRI-EBI GWAS Catalog (MacArthur et al, 2017), where our approach could potentially be valuable (S21 Table). Intuitively, our approach leverages both large training sample sizes and training data with target-matched LD patterns.…”
Section: Discussion (supporting)
confidence: 70%
“…Existing training data sets have much larger sample sizes in European populations, but the use of European training data for polygenic risk prediction in non-European populations reduces prediction accuracy, due to different patterns of linkage disequilibrium (LD) (or potentially due to different causal effects) (International Schizophrenia Consortium et al, 2009; Rosenberg et al, 2010; Scutari, Mackay, & Balding, 2016; Vilhjálmsson et al, 2015). For example, ref.…”
Section: Introduction (mentioning)
confidence: 99%
“…Although increasingly emphasised in the recent GWAS literature, it is worth noting that the "loss of accuracy" problem is not utterly new. Indeed, a number of studies in the animal breeding literature have previously reported lower accuracy of genomic selection across genetically distant breeds [4,5], consistent with the observation of limited transferability of GWAS findings across diverse human populations [6,7]. These studies also highlight major factors influencing that loss such as differences between populations in causal variants effect sizes, in alleles frequencies and in linkage disequilibrium (LD) between causal variants and SNPs assayed in GWAS.…”
mentioning
confidence: 65%
“…For example, a previous study showed that the accuracy of breeding values and genomic prediction decays approximately linearly with increasing divergence between the discovery and target population [8]. Additionally, multiple individuals with African ancestry have received false positive misdiagnoses of hypertrophic cardiomyopathy that would have been prevented with the inclusion of even small numbers of African Americans in these studies [9]. Further, a previous study finding that 96% of GWAS participants are of European descent [1] has recently been updated; although the non-European proportion of GWAS participants has increased to nearly 20%, this is primarily driven by Asian individuals, and the proportion of individuals with African and Hispanic/Latino ancestry in GWASs has remained essentially unchanged.…”
Section: Introduction (mentioning)
confidence: 99%