To date, most genome-wide association studies (GWAS) and studies of fine-scale population structure have been conducted primarily on Europeans. Han Chinese, the largest ethnic group in the world, composing 20% of the entire global human population, is largely underrepresented in such studies. A well-recognized challenge is the fact that population structure can cause spurious associations in GWAS. In this study, we examined population substructures in a diverse set of over 1700 Han Chinese samples collected from 26 regions across China, each genotyped at approximately 160K single-nucleotide polymorphisms (SNPs). Our results showed that the Han Chinese population is intricately substructured, with the main observed clusters corresponding roughly to northern Han, central Han, and southern Han. However, simulated case-control studies showed that genetic differentiation among these clusters, although very small (F(ST) = 0.0002 approximately 0.0009), is sufficient to lead to an inflated rate of false-positive results even when the sample size is moderate. The top two SNPs with the greatest frequency differences between the northern Han and southern Han clusters (F(ST) > 0.06) were found in the FADS2 gene, which associates with the fatty acid composition in phospholipids, and in the HLA complex P5 gene (HCP5), which associates with HIV infection, psoriasis, and psoriatic arthritis. Ingenuity Pathway Analysis (IPA) showed that most differentiated genes among clusters are involved in cardiac arteriopathy (p < 10(-101)). These signals indicating significant differences among Han Chinese subpopulations should be carefully explained in case they are also detected in association studies, especially when sample sources are diverse.
Genetic studies of Tibetans, an ethnic group with a long-lasting presence on the Tibetan Plateau which is known as the highest plateau in the world, may offer a unique opportunity to understand the biological adaptations of human beings to high-altitude environments. We conducted a genome-wide study of 1,000,000 genetic variants in 46 Tibetans (TBN) and 92 Han Chinese (HAN) for identifying the signals of high-altitude adaptations (HAAs) in Tibetan genomes. We discovered the most differentiated variants between TBN and HAN at chromosome 1q42.2 and 2p21. EGLN1 (or HIFPH2, MIM 606425) and EPAS1 (or HIF2A, MIM 603349), both related to hypoxia-inducible factor, were found most differentiated in the two regions, respectively. Strong positive correlations were also observed between the frequency of TBN-dominant haplotypes in the two gene regions and altitude in East Asian populations. Linkage disequilibrium and further haplotype network analyses of world-wide populations suggested the antiquity of the TBN-dominant haplotypes and long-term persistence of the natural selection. Finally, a "dominant haplotype carrier" hypothesis could describe the role of the two genes in HAA. All of our population genomic and statistical analyses indicate that EPAS1 and EGLN1 are most likely responsible for HAA of Tibetans. Interestingly, one each but not both of the two genes were also identified by three recent studies. We reanalyzed the available data and found the escaped top signal (EPAS1) could be recaptured with data quality control and our approaches. Based on this experience, we call for more attention to be paid to controlling data quality and batch effects introduced in public data integration. Our results also suggest limitations of extended haplotype homozygosity-based method due to its compromised power in case the natural selection initiated long time ago and particularly in genomic regions with recombination hotspots.
Demographic change of human populations is one of the central questions for delving into the past of human beings. To identify major population expansions related to male lineages, we sequenced 78 East Asian Y chromosomes at 3.9 Mbp of the non-recombining region, discovered >4,000 new SNPs, and identified many new clades. The relative divergence dates can be estimated much more precisely using a molecular clock. We found that all the Paleolithic divergences were binary; however, three strong star-like Neolithic expansions at ∼6 kya (thousand years ago) (assuming a constant substitution rate of 1×10−9/bp/year) indicates that ∼40% of modern Chinese are patrilineal descendants of only three super-grandfathers at that time. This observation suggests that the main patrilineal expansion in China occurred in the Neolithic Era and might be related to the development of agriculture.
Checkpoint blockade-mediated immunotherapy is emerging as an effective treatment modality for multiple cancer types. However, cancer cells frequently evade the immune system, compromising the effectiveness of immunotherapy. It is crucial to develop screening methods to identify the patients who would most benefit from these therapies because of the risk of the side effects and the high cost of treatment. Here we show that expression of the MHC class I transactivator (CITA), NLRC5, is important for efficient responses to anti-CTLA-4 and anti-PD1 checkpoint blockade therapies. Melanoma tumors derived from patients responding to immunotherapy exhibited significantly higher expression of NLRC5 and MHC class I-related genes compared to non-responding patients. In addition, multivariate analysis that included the number of tumor-associated non-synonymous mutations, predicted neo-antigen load and PD-L2 expression was capable of further stratifying responders and non-responders to anti-CTLA4 therapy. Moreover, expression or methylation of NLRC5 together with total somatic mutation number were significantly correlated with increased patient survival. These results suggest that NLRC5 tumor expression, alone or together with tumor mutation load constitutes a valuable predictive biomarker for both prognosis and response to anti-CTLA-4 and potentially anti-PD1 blockade immunotherapy in melanoma patients.
De novo mutations (DNMs) are increasingly recognized as rare disease causal factors. Identifying DNM carriers will allow researchers to study the likely distinct molecular mechanisms of DNMs. We developed Famdenovo to predict DNM status (DNM or familial mutation [FM]) of deleterious autosomal dominant germline mutations for any syndrome. We introduce Famdenovo.TP53 for Li-Fraumeni syndrome (LFS) and analyze 324 LFS family pedigrees from four US cohorts: a validation set of 186 pedigrees and a discovery set of 138 pedigrees. The concordance index for Famdenovo.TP53 prediction was 0.95 (95% CI: [0.92, 0.98]). Forty individuals (95% CI: [30, 50]) were predicted as DNM carriers, increasing the total number from 42 to 82. We compared clinical and biological features of FM versus DNM carriers: (1) cancer and mutation spectra along with parental ages were similarly distributed; (2) ascertainment criteria like early-onset breast cancer (age 20-35 yr) provides a condition for an unbiased estimate of the DNM rate: 48% (23 DNMs vs. 25 FMs); and (3) hotspot mutation R248W was not observed in DNMs, although it was as prevalent as hotspot mutation R248Q in FMs. Furthermore, we introduce Famdenovo.BRCA for hereditary breast and ovarian cancer syndrome and apply it to a small set of family data from the Cancer Genetics Network. In summary, we introduce a novel statistical approach to systematically evaluate deleterious DNMs in inherited cancer syndromes. Our approach may serve as a foundation for future studies evaluating how new deleterious mutations can be established in the germline, such as those in TP53.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.