Portability of Tag SNPs Across Isolated Population Groups: An Example from India

Roy, N.; Farheen, Shabana; Sengupta, Sanghamitra; Majumder, Partha P.

doi:10.1111/j.1469-1809.2006.00383.x

Cited by 8 publications

(9 citation statements)

References 21 publications

“…One of these groups is sub‐Saharan African populations, who have considerably lower levels of LD than other populations (Reich et al 2001; Gabriel et al 2002; Tishkoff & Kidd, 2004; Hinds et al 2005; The International HapMap Consortium, 2005; Sawyer et al 2005; Conrad et al 2006) and who therefore require more tag SNPs to attain the same genomic coverage as can be obtained elsewhere. The other group consists of intermediate‐LD non‐African populations who are genetically distant from populations in the HapMap (Conrad et al 2006; Johansson et al 2007; Roy et al 2008). Such populations – found mainly in parts of Eurasia far from HapMap locations – do not benefit either from the relative ease of identifying tag SNPs in high‐LD populations using almost any low‐ or intermediate‐LD donor sample, or from the boost in tag performance supplied by a close genetic relationship to a HapMap population.…”

Section: Introduction (mentioning)

confidence: 99%

“…These linguistically defined groups were chosen from a larger survey of Indian genetic variation (Rosenberg et al 2006) to represent parts of India distant from other places in which haplotype variation has previously been more extensively studied. India has been largely omitted from genomic LD studies, and as a result of its intermediate location between Europe and East Asia, SNP variation in Indian groups is expected to be imperfectly captured by any single HapMap sample (Roy et al 2008). Thus, use of mixtures may have some potential for improving the prospects for genetic association studies in Indian populations.…”

Section: Introduction (mentioning)

confidence: 99%

Using Population Mixtures to Optimize the Utility of Genomic Databases: Linkage Disequilibrium and Association Study Design in India

Pemberton, Jakobsson, Conrad, et al. 2008. Annals of Human Genetics.

Summary: When performing association studies in populations that have not been the focus of large-scale investigations of haplotype variation, it is often helpful to rely on genomic databases in other populations for study design and analysis, such as in the selection of tag SNPs and in the imputation of missing genotypes. One way of improving the use of these databases is to rely on a mixture of database samples that is similar to the population of interest, rather than using the single most similar database sample. We demonstrate the effectiveness of the mixture approach in the application of African, European, and East Asian HapMap samples for tag SNP selection in populations from India, a genetically intermediate region underrepresented in genomic studies of haplotype variation.

“…We let x = p AB , y = p H B , and z = p H A and rewrite the formula for the correlation coefficient in terms of x, y, z. (13) We compute the gradient of f:…”

Section: Estimating Correlation Variance With the Delta Methods (mentioning)

confidence: 99%

“…Second, the estimates of r² are very inaccurate which leads to inaccurate estimates of the power of association studies. Several groups have previously pointed out this limitation [10] and have performed empirical studies exploring this and the related issue of the transferability of tag SNPs to different populations [11][12][13]. Finally, the linkage structure is used by imputation methods to estimate frequencies of untyped SNPs in association studies.…”

Section: Introduction (mentioning)

confidence: 99%

Linkage Effects and Analysis of Finite Sample Errors in the HapMap

Zaitlen, Kang, Eskin. 2009. Hum Hered.

The HapMap provides a valuable resource to help uncover genetic variants of important complex phenotypes such as disease risk and outcome. Using the HapMap we can infer the patterns of LD within different human populations. This is a critical step for determining which SNPs to genotype as part of a study, estimating study power, designing a follow-up study to identify the causal variants, ‘imputing’ untyped SNPs, and estimating recombination rates along the genome. Despite its tremendous importance, the HapMap suffers from the fundamental limitation that at most 60 unrelated individuals are available per population. We present an analytical framework for analyzing the implications of a finite sample HapMap. We present and justify simple approximations for deriving analytical estimates of important statistics such as the square of the correlation coefficient r² between two SNPs. Finally, we use this framework to show that current HapMap based estimates of r² and power have significant errors, and that tag sets highly overestimate their coverage. We show that a reasonable increase in the number of individuals, such as that proposed by the 1000 genomes project, greatly reduces the errors due to finite sample size for a large proportion of SNPs.
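The finite-sample issue described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' analytical framework; it simply computes r² from haplotype counts and resamples it. The population haplotype frequencies, the sample sizes (120 haplotypes for 60 diploid individuals, and a hypothetical larger panel of 2000 haplotypes), and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def r2_from_freqs(p_ab, p_a, p_b):
    """r^2 = D^2 / (p_a (1 - p_a) p_b (1 - p_b)), with D = p_ab - p_a * p_b."""
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None  # undefined when either SNP is monomorphic
    d = p_ab - p_a * p_b
    return d * d / denom


def r_squared(haplotypes):
    """Estimate r^2 from a list of (a, b) haplotypes with alleles coded 0/1."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    return r2_from_freqs(p_ab, p_a, p_b)


def simulate_estimates(freqs, n_haplotypes, replicates, seed=0):
    """Repeatedly draw finite samples and collect the r^2 estimates."""
    rng = random.Random(seed)
    keys = list(freqs)
    weights = [freqs[k] for k in keys]
    estimates = []
    for _ in range(replicates):
        sample = rng.choices(keys, weights=weights, k=n_haplotypes)
        est = r_squared(sample)
        if est is not None:
            estimates.append(est)
    return estimates


# Hypothetical population haplotype frequencies (illustrative only).
POPULATION = {(1, 1): 0.4, (1, 0): 0.1, (0, 1): 0.1, (0, 0): 0.4}
TRUE_R2 = r2_from_freqs(0.4, 0.5, 0.5)

small = simulate_estimates(POPULATION, 120, 1000)    # 60 diploid individuals
large = simulate_estimates(POPULATION, 2000, 1000)   # hypothetical larger panel

print("true r^2:", TRUE_R2)
print("n=120  mean, sd:", statistics.mean(small), statistics.pstdev(small))
print("n=2000 mean, sd:", statistics.mean(large), statistics.pstdev(large))
```

Comparing the two standard deviations gives a rough sense of how much sampling noise a panel of 60 individuals adds to a single pairwise r² estimate under these assumed frequencies.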

“…However, identification of tag SNPs may vary with the method of haplotype construction (Niu, 2004). Further, tag SNPs selected from one population may not apply to other populations (Liu et al, 2004; Roy et al, 2007).…”

Section: Introduction (mentioning)

confidence: 99%

An approach to incorporate linkage disequilibrium structure into genomic association analysis

Zhang, Wagener. 2008. Journal of Genetics and Genomics.

In this study, we propose to use the principal component analysis (PCA) and regression model to incorporate linkage disequilibrium (LD) in genomic association data analysis. To accommodate LD in genomic data and reduce multiple testing, we suggest performing PCA and extracting the PCA score to capture the variation of genomic data, after which regression analysis is used to assess the association of the disease with the principal component score. An empirical analysis result shows that both genotype-based correlation matrix and haplotype-based LD matrix can produce similar results for PCA. Principal component score seems to be more powerful in detecting genetic association because the principal component score is quantitatively measured and may be able to capture the effect of multiple loci.
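The general idea in this abstract (summarise correlated SNPs by a principal component score, then regress the trait on that score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the genotypes and the quantitative phenotype are simulated, the first component is found by power iteration, ordinary least squares stands in for whatever regression model the paper uses, and every function name is hypothetical.

```python
import random


def center_columns(rows):
    """Subtract each column mean so the PCA works on centered genotypes."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in rows]


def first_pc(centered, iters=200):
    """Leading eigenvector of X^T X by power iteration (unit length).

    Starts from equal weights, which suits positively correlated SNPs;
    the sign of the returned vector is arbitrary.
    """
    m = len(centered[0])
    v = [1.0 / m ** 0.5] * m
    for _ in range(iters):
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        w_new = [sum(row[j] * s for row, s in zip(centered, scores))
                 for j in range(m)]
        norm = sum(c * c for c in w_new) ** 0.5
        if norm == 0:
            return v
        v = [c / norm for c in w_new]
    return v


def simple_regression(x, y):
    """Ordinary least squares of y on a single predictor x: (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


rng = random.Random(1)


def draw_genotype(p):
    """Genotype coded 0/1/2: number of copies of an allele with frequency p."""
    return sum(rng.random() < p for _ in range(2))


# Simulated data (illustrative only): 5 SNPs in LD with a shared core genotype.
genotypes, phenotype = [], []
for _ in range(200):
    core = draw_genotype(0.4)
    row = [core if rng.random() < 0.9 else draw_genotype(0.4) for _ in range(5)]
    genotypes.append([float(g) for g in row])
    phenotype.append(0.5 * core + rng.gauss(0.0, 1.0))

centered = center_columns(genotypes)
loadings = first_pc(centered)
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]
slope, intercept = simple_regression(scores, phenotype)
print("PC1 loadings:", loadings)
print("slope of phenotype on PC1 score:", slope)
```

A single regression on the component score replaces five per-SNP tests here, which is the multiple-testing reduction the abstract refers to; a case-control study would more likely use logistic regression in place of the least-squares step.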
