The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in unprecedented detail, but presents new statistical challenges. We propose a novel inference framework that aims to efficiently capture information on population structure provided by patterns of haplotype similarity. Each individual in a sample is considered in turn as a recipient, whose chromosomes are reconstructed using chunks of DNA donated by the other individuals. Results of this “chromosome painting” can be summarized as a “coancestry matrix,” which directly reveals key information about ancestral relationships among individuals. If markers are viewed as independent, we show that this matrix almost completely captures the information used by both standard Principal Components Analysis (PCA) and model-based approaches such as STRUCTURE in a unified manner. Furthermore, when markers are in linkage disequilibrium, the matrix combines information across successive markers to increase the ability to discern fine-scale population structure using PCA. In parallel, we have developed an efficient model-based approach to identify discrete populations using this matrix, which offers advantages over PCA in terms of interpretability and over existing clustering algorithms in terms of speed, number of separable populations, and sensitivity to subtle population structure. We analyse Human Genome Diversity Panel data for 938 individuals and 641,000 markers, and we identify 226 populations reflecting differences on continental, regional, local, and family scales. We present multiple lines of evidence that, while many methods capture similar information among strongly differentiated groups, more subtle population structure in human populations is consistently present at a much finer level than currently available geographic labels and is only captured by the haplotype-based approach. The software used for this article, ChromoPainter and fineSTRUCTURE, is available from http://www.paintmychromosomes.com/.
Multiple sclerosis (OMIM 126200) is a common disease of the central nervous system in which the interplay between inflammatory and neurodegenerative processes typically results in intermittent neurological disturbance followed by progressive accumulation of disability.1 Epidemiological studies have shown that genetic factors are primarily responsible for the substantially increased frequency of the disease seen in the relatives of affected individuals;2,3 and systematic attempts to identify linkage in multiplex families have confirmed that variation within the Major Histocompatibility Complex (MHC) exerts the greatest individual effect on risk.4 Modestly powered Genome-Wide Association Studies (GWAS)5-10 have enabled more than 20 additional risk loci to be identified and have shown that multiple variants exerting modest individual effects play a key role in disease susceptibility.11 Most of the genetic architecture underlying susceptibility to the disease remains to be defined and is anticipated to require the analysis of sample sizes that are beyond the numbers currently available to individual research groups. In a collaborative GWAS involving 9772 cases of European descent collected by 23 research groups working in 15 different countries, we have replicated almost all of the previously suggested associations and identified at least a further 29 novel susceptibility loci. Within the MHC we have refined the identity of the DRB1 risk alleles and confirmed that variation in the HLA-A gene underlies the independent protective effect attributable to the Class I region. Immunologically relevant genes are significantly over-represented amongst those mapping close to the identified loci and particularly implicate T helper cell differentiation in the pathogenesis of multiple sclerosis.
Modern genetic data combined with appropriate statistical methods have the potential to contribute substantially to our understanding of human history. We have developed an approach that exploits the genomic structure of admixed populations to date and characterize historical mixture events at fine scales. We used this to produce an atlas of worldwide human admixture history, constructed using genetic data alone and encompassing over 100 events occurring over the past 4,000 years. We identify events whose dates and participants suggest they describe genetic impacts of the Mongol Empire, Arab slave trade, Bantu expansion, first millennium CE migrations in eastern Europe, and European colonialism, as well as unrecorded events, revealing admixture to be an almost universal force shaping human populations.
Summary To gain further insight into the genetic architecture of psoriasis, we conducted a meta-analysis of three genome-wide association studies (GWAS) and two independent datasets genotyped on the Immunochip, involving 10,588 cases and 22,806 controls in total. We identified 15 new disease susceptibility regions, increasing the number of psoriasis-associated loci to 36 for Caucasians. Conditional analyses identified five independent signals within previously known loci. The newly identified shared disease regions encompassed a number of genes whose products regulate T-cell function (e.g. RUNX3, TAGAP and STAT3). The new psoriasis-specific regions were notable for candidate genes whose products are involved in innate host defense, encoding proteins with roles in interferon-mediated antiviral responses (DDX58), macrophage activation (ZC3H12C), and NF-κB signaling (CARD14 and CARM1). These results portend a better understanding of shared and distinctive genetic determinants of immune-mediated inflammatory disorders and emphasize the importance of the skin in innate and acquired host defense.
Summary Fine-scale genetic variation between human populations is interesting as a signature of historical demographic events and because of its potential for confounding disease studies. We use haplotype-based statistical methods to analyse genome-wide SNP data from a carefully chosen geographically diverse sample of 2,039 individuals from the United Kingdom (UK). This reveals a rich and detailed pattern of genetic differentiation with remarkable concordance between genetic clusters and geography. The regional genetic differentiation and differing patterns of shared ancestry with 6,209 individuals from across Europe carry clear signals of historical demographic events. We estimate the genetic contribution to SE England from Anglo-Saxon migrations to be under half, identify the regions not carrying genetic material from these migrations, suggest significant pre-Roman but post-Mesolithic movement into SE England from the Continent, and show that in non-Saxon parts of the UK there exist genetically differentiated subgroups rather than a general “Celtic” population.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.