Cathie Sudlow and colleagues describe the UK Biobank, a large population-based prospective study, established to allow investigation of the genetic and non-genetic determinants of the diseases of middle and old age.
There is increasing evidence that genome-wide association (GWA) studies represent a powerful approach to the identification of genes involved in common human diseases. We describe a joint GWA study (using the Affymetrix GeneChip 500K Mapping Array Set) undertaken in the British population, which has examined approximately 2,000 individuals for each of 7 major diseases and a shared set of approximately 3,000 controls. Case-control comparisons identified 24 independent association signals at
Inter-individual variation in mean leukocyte telomere length (LTL) is associated with cancer and several age-associated diseases. Here, in a genome-wide meta-analysis of 37,684 individuals with replication of selected variants in a further 10,739 individuals, we identified seven loci, including five novel loci, associated with mean LTL (P < 5×10⁻⁸). Five of the loci contain genes (TERC, TERT, NAF1, OBFC1, RTEL1) that are known to be involved in telomere biology. Lead SNPs at two loci (TERC and TERT) associate with several cancers and other diseases, including idiopathic pulmonary fibrosis. Moreover, a genetic risk score analysis combining lead variants at all seven loci in 22,233 coronary artery disease (CAD) cases and 64,762 controls showed an association of the alleles associated with shorter LTL with increased risk of CAD (21% (95% CI: 5–35%) per standard deviation in LTL, P = 0.014). Our findings support a causal role of telomere length variation in some age-related diseases.
Elevated blood pressure is a common, heritable cause of cardiovascular disease worldwide. To date, identification of common genetic variants influencing blood pressure has proven challenging. We tested 2.5 million genotyped and imputed SNPs for association with systolic and diastolic blood pressure in 34,433 subjects of European ancestry from the Global BPgen consortium and followed up findings with direct genotyping (N ≤ 71,225 European ancestry, N = 12,889 Indian Asian ancestry) and in silico comparison (CHARGE consortium, N = 29,136). We identified association between systolic or diastolic blood pressure and common variants in 8 regions near the CYP17A1 (P = 7×10⁻²⁴), CYP1A2 (P = 1×10⁻²³), FGF5 (P = 1×10⁻²¹), SH2B3 (P = 3×10⁻¹⁸), MTHFR (P = 2×10⁻¹³), c10orf107 (P = 1×10⁻⁹), ZNF652 (P = 5×10⁻⁹) and PLCD3 (P = 1×10⁻⁸) genes. All variants associated with continuous blood pressure were associated with dichotomous hypertension. These associations between common variants and blood pressure and hypertension offer mechanistic insights into the regulation of blood pressure and may point to novel targets for interventions to prevent cardiovascular disease.
A population-based study of a quantitative trait may be seriously compromised when the trait is subject to the effects of a treatment. For example, in a typical study of quantitative blood pressure (BP) 15 per cent or more of middle-aged subjects may take antihypertensive treatment. Without appropriate correction, this can lead to substantial shrinkage in the estimated effect of aetiological determinants of scientific interest and a marked reduction in statistical power. Correction relies upon imputation, in treated subjects, of the underlying BP from the observed BP having invoked one or more assumptions about the bioclinical setting. There is a range of different assumptions that may be made, and a number of different analytical models that may be used. In this paper, we motivate an approach based on a censored normal regression model and compare it with a range of other methods that are currently used or advocated. We compare these methods in simulated data sets and assess the estimation bias and the loss of power that ensue when treatment effects are not appropriately addressed. We also apply the same methods to real data and demonstrate a pattern of behaviour that is consistent with that in the simulation studies. Although all approaches to analysis are necessarily approximations, we conclude that two of the adjustment methods appear to perform well across a range of realistic settings. These are: (1) the addition of a sensible constant to the observed BP in treated subjects; and (2) the censored normal regression model. A third, non-parametric, method based on averaging ordered residuals may also be advocated in some settings. On the other hand, three approaches that are used relatively commonly are fundamentally flawed and should not be used at all. These are: (i) ignoring the problem altogether and analysing observed BP in treated subjects as if it were underlying BP; (ii) fitting a conventional regression model with treatment as a binary covariate; and (iii) excluding treated subjects from the analysis. Given that the more effective methods are straightforward to implement, there is no argument for undertaking a flawed analysis that wastes power and results in excessive bias.
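The contrast between the flawed and the recommended approaches can be illustrated with a small simulation. The sketch below is not the authors' code or their simulation design; it is a minimal illustration under assumptions chosen here: a single standardised determinant with a true effect of 5 mmHg, treatment given to 70% of subjects whose underlying BP exceeds 140 mmHg, a treatment effect averaging 15 mmHg, and a correction constant of 15 mmHg. All function names and parameter values are hypothetical. It compares (i) analysing observed BP as is, (iii) excluding treated subjects, and (1) adding a constant to observed BP in treated subjects. The binary-covariate approach and the censored normal regression model are not shown.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=50_000, true_slope=5.0, seed=1):
    """Simulate (determinant, observed BP, treated) rows.

    All numeric settings are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)  # standardised determinant
        underlying = 130 + true_slope * x + rng.gauss(0, 15)
        # Subjects with high underlying BP are more likely to be treated.
        treated = underlying > 140 and rng.random() < 0.7
        # Treatment lowers BP by about 15 mmHg on average (assumed).
        effect = rng.gauss(15, 5) if treated else 0.0
        rows.append((x, underlying - effect, treated))
    return rows


def compare_methods(rows, constant=15.0):
    """Estimate the determinant's effect under three analysis strategies."""
    xs = [x for x, _, _ in rows]
    observed = [y for _, y, _ in rows]

    # (i) Ignore treatment: analyse observed BP as if it were underlying BP.
    naive = ols_slope(xs, observed)

    # (iii) Exclude treated subjects from the analysis.
    untreated = [(x, y) for x, y, t in rows if not t]
    excluded = ols_slope([x for x, _ in untreated], [y for _, y in untreated])

    # (1) Add a fixed constant to observed BP in treated subjects.
    adjusted = [y + constant if t else y for _, y, t in rows]
    constant_added = ols_slope(xs, adjusted)

    return {
        "naive": naive,
        "treated_excluded": excluded,
        "constant_added": constant_added,
    }


if __name__ == "__main__":
    for method, slope in compare_methods(simulate()).items():
        print(f"{method}: {slope:.2f}")
```

Under these assumptions, the first two estimates should fall below the true effect of 5 mmHg, while the constant-addition estimate should lie close to it. In real data the appropriate constant is unknown, so the result depends on how sensible the chosen value is.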