The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.
The genetic architecture of brain structure and function is largely unknown. To investigate this, we carried out genome-wide association studies of 3,144 functional and structural brain imaging phenotypes from UK Biobank (discovery dataset 8,428 subjects). Here we show that many of these phenotypes are heritable. We identify 148 clusters of associations between single nucleotide polymorphisms and imaging phenotypes that replicate at P < 0.05, when we would expect 21 to replicate by chance. Notable significant, interpretable associations include: iron transport and storage genes, related to magnetic susceptibility of subcortical brain tissue; extracellular matrix and epidermal growth factor genes, associated with white matter micro-structure and lesions; genes that regulate mid-line axon development, associated with organization of the pontine crossing tract; and overall 17 genes involved in development, pathway signalling and plasticity. Our results provide insights into the genetic architecture of the brain that are relevant to neurological and psychiatric disorders, brain development and ageing.
The UK Biobank project is a large prospective cohort study of ~500,000 individuals from across the United Kingdom, aged 40-69 at recruitment. A rich variety of phenotypic and health-related information is available on each participant, making the resource unprecedented in its size and scope. Here we describe the genome-wide genotype data (~805,000 markers) collected on all individuals in the cohort and its quality control procedures. Genotype data on this scale offer novel opportunities for assessing quality issues, although the wide range of ancestries of the individuals in the cohort also creates particular challenges. We also conducted a set of analyses that reveal properties of the genetic data (such as population structure and relatedness) that can be important for downstream analyses. In addition, we phased and imputed genotypes into the dataset, using computationally efficient methods combined with the Haplotype Reference Consortium (HRC) and UK10K haplotype resource. This increases the number of testable variants by over 100-fold to ~96 million variants. We also imputed classical allelic variation at 11 human leukocyte antigen (HLA) genes, and as a quality control check of this imputation, we replicate signals of known associations between HLA alleles and many common diseases. We describe tools that allow efficient genome-wide association studies (GWAS) of multiple traits and fast phenome-wide association studies (PheWAS), which work together with a new compressed file format that has been used to distribute the dataset. As a further check of the genotyped and imputed datasets, we performed a test-case genome-wide association scan on a well-studied human trait, standing height.
UK Biobank is a major prospective epidemiological study, including multimodal brain imaging, genetics and ongoing health outcomes. Previously, we published genome-wide associations of 3,144 brain imaging-derived phenotypes, with a discovery sample of 8,428 subjects. Here we present a new open resource of GWAS summary statistics, using the 2020 data release, almost tripling the discovery sample size. We now include the X chromosome, and new classes of imaging-derived phenotypes (subcortical volumes and tissue contrast). Previously we had found 148 replicated clusters of associations between genetic variants and imaging phenotypes; here we find 692, including 12 on the X chromosome. We describe some of the newly found associations, focussing on the X chromosome and autosomal associations involving the new classes of imaging-derived phenotypes. Our novel associations implicate, for example, pathways involved in the rare X-linked syndrome STAR (syndactyly, telecanthus and anogenital and renal malformations), Alzheimer’s disease and mitochondrial disorders.
Brain imaging can be used to study how individuals' brains are aging, compared against population norms. This can inform on aspects of brain health; for example, smoking and blood pressure can be seen to accelerate brain aging. Typically, a single "brain age" is estimated per subject, whereas here we identified 62 modes of subject variability, from 21,407 subjects' multimodal brain imaging data in UK Biobank. The modes represent different aspects of brain aging, showing distinct patterns of functional and structural brain change, and distinct patterns of association with genetics, lifestyle, cognition, physical measures and disease. While conventional brain-age modelling found no genetic associations, 34 modes had genetic associations. We suggest that it is important not to treat brain aging as a single homogeneous process, and that modelling of distinct patterns of structural and functional change will reveal more biologically meaningful markers of brain aging in health and disease.
Keywords: Brain aging, brain imaging, UK Biobank.
Corresponding author: Stephen Smith, steve@fmrib.ox.ac.uk
Introduction
Brain imaging can be used to predict "brain age", the apparent age of individuals' brains, by comparing their imaging data against a normative population dataset. The difference between brain age and actual chronological age (the "delta", or "brain age gap") is often then computed, providing a measure of whether a subject's brain appears to have aged more (or less) than the average age-matched population data. For example, looking at structural magnetic resonance imaging (MRI) data, a high degree of atrophy would cause a subject's brain to appear older than a normal age-matched brain.
Estimation of brain age and the delta is of value in studying both normal aging and disease, with some diseases, such as Alzheimer's disease, showing similar patterns of change to that of accelerated healthy aging [Franke et al., 2010; Cole and Franke, 2017].
The typical approach uses one or more imaging modalities, most commonly using just a single structural image from each subject. The data is then preprocessed, and features identified, for use in the brain age prediction. For example, the structural images may be warped into a standard space, and grey matter segmentation carried out; the voxelwise segmentation values themselves can then be the features. Alternatively, a smaller number of more highly-condensed features may be derived, such as volumes of grey and white matter within multiple brain regions. The resulting dataset, of multiple subjects' feature sets, along with their true ages, is then passed into a supervised-learning algorithm (e.g., regression, support vector machine or deep learning). The algorithm then learns to predict the subjects' ages from their brain imaging features. Finally, the true age is typically subtracted from the estimated brain age, to create a delta, potentially with corrections for biases such as systematic mis-estimation of brain age [Le et al., 2018].
The imaging feature set can be derived from more ...
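The typical approach described above (features, supervised prediction of age, then a delta) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' method: it uses a single synthetic feature and ordinary least squares in place of the richer feature sets and learners mentioned in the text, it fits and predicts on the same subjects (a real analysis would cross-validate), and its bias correction is one common choice (regressing the delta on age). All names and numbers are hypothetical.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def brain_age_delta(feature, age):
    """Return (brain_age, delta, delta_corrected) for one imaging feature."""
    # Step 1: learn to predict age from the imaging feature.
    a, b = fit_line(feature, age)
    brain_age = [a + b * f for f in feature]
    # Step 2: delta = estimated brain age minus true age.
    delta = [ba - t for ba, t in zip(brain_age, age)]
    # Step 3 (assumed correction): remove the linear dependence of delta on age,
    # which arises because regression shrinks predictions towards the mean age.
    c, d = fit_line(age, delta)
    delta_corrected = [dl - (c + d * t) for dl, t in zip(delta, age)]
    return brain_age, delta, delta_corrected


# Synthetic cohort (hypothetical values): a volume-like feature that
# declines with age, plus subject-specific noise.
rng = random.Random(0)
age = [rng.uniform(45.0, 80.0) for _ in range(500)]
feature = [1000.0 - 3.0 * t + rng.gauss(0.0, 20.0) for t in age]

brain_age, delta, delta_corrected = brain_age_delta(feature, age)

# Slope of delta against age before and after the correction.
slope_before = fit_line(age, delta)[1]
slope_after = fit_line(age, delta_corrected)[1]
print(f"delta-vs-age slope before correction: {slope_before:.3f}")
print(f"delta-vs-age slope after correction:  {slope_after:.3e}")
```

The uncorrected delta is negatively related to age, so older subjects tend to look "younger" than they are and younger subjects "older"; the correction step removes that linear trend, which is the kind of systematic mis-estimation the text refers to.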