Polygenic risk scores (PRS) leverage the genetic contribution of an individual’s genotype to a complex trait by estimating disease risk. Traditional PRS prediction methods are developed predominantly for the European population. The accuracy of PRS prediction in non-European populations is diminished because the sample sizes of their genome-wide association studies (GWAS) are much smaller. In this article, we introduce a novel method to construct PRS for non-European populations, abbreviated as TL-Multi, which uses a transfer learning framework to learn useful knowledge from the European population and correct the bias for non-European populations. We consider non-European GWAS data as the target data and European GWAS data as the informative auxiliary data. TL-Multi borrows useful information from the auxiliary data to improve the learning accuracy of the target data while preserving efficiency and accuracy. To demonstrate the practical applicability of the proposed method, we applied TL-Multi to predict the risk of systemic lupus erythematosus (SLE) in the Asian population and the risk of asthma in the Indian population by borrowing information from the European population. TL-Multi achieved better prediction accuracy than the competing methods, including Lassosum and meta-analysis, in both simulations and real applications.
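As a point of reference for the abstract above, a PRS is conventionally computed as a weighted sum of an individual's effect-allele counts. The following is a minimal sketch of that generic definition only, not of TL-Multi itself; the function name, the variant weights, and the genotype values are hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of effect-allele dosages (0, 1, or 2 per variant).

    `weights` are per-allele effect sizes, for example log odds ratios
    estimated from GWAS summary statistics (hypothetical values here).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(g * w for g, w in zip(dosages, weights))


# Hypothetical toy data: three variants and one individual's genotype.
toy_weights = [0.5, -0.25, 1.0]
toy_genotype = [2, 1, 0]
toy_score = polygenic_risk_score(toy_genotype, toy_weights)
```

Methods such as those named in the abstract differ in how the weights are estimated; the scoring step itself stays this simple.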
Motivation: It is of scientific interest to identify DNA methylation CpG sites that might mediate the effect of an environmental exposure on a survival outcome in high-dimensional mediation analysis. However, there is a lack of powerful statistical methods that can provide a guarantee of false discovery rate (FDR) control in finite-sample settings. Results: In this article, we propose a novel method called CoxMKF, which applies aggregation of multiple knockoffs to a Cox proportional hazards model for a survival outcome with high-dimensional mediators. The proposed CoxMKF can achieve FDR control even in finite-sample settings, which is particularly advantageous when the sample size is not large. Moreover, our proposed CoxMKF can overcome the randomness of the unstable model-X knockoffs. Our simulation results show that CoxMKF controls FDR well in finite samples. We further apply CoxMKF to a lung cancer data set from The Cancer Genome Atlas (TCGA) project with 754 subjects and 365 306 DNA methylation CpG sites, and identify four DNA methylation CpG sites that might mediate the effect of smoking on the overall survival among lung cancer patients. Availability: The R package CoxMKF is publicly available at https://github.com/MinhaoYaooo/CoxMKF. Supplementary information: Supplementary data are available at Bioinformatics online.
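For readers unfamiliar with knockoff-based FDR control, the selection step of a single knockoff run can be sketched as follows. This illustrates the standard knockoff+ threshold only; it is not the aggregation over multiple knockoffs that CoxMKF performs, and it does not reproduce the CoxMKF R package. The function names and the statistics in the toy example are hypothetical.

```python
from typing import List, Sequence


def knockoff_plus_threshold(w: Sequence[float], q: float) -> float:
    """Smallest t in {|W_j| > 0} with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    Returns infinity when no candidate threshold meets the target FDR level q.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w: Sequence[float], q: float) -> List[int]:
    """Indices of features whose statistic W_j reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Hypothetical feature statistics: large positive values favour the real
# feature over its knockoff copy; negative values favour the knockoff.
toy_w = [5.0, 4.0, 3.0, 2.5, 2.0, -1.0, 0.5]
toy_selected = knockoff_select(toy_w, q=0.25)
```

Because the knockoff copies are randomly generated, a single run of this procedure can select different features on repeated runs, which is the instability that aggregating multiple knockoffs is meant to address.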
Polygenic risk scores (PRS) leverage the genetic contribution of an individual's genotype by estimating disease risk. Traditional PRS prediction methods are developed predominantly for the European population. The accuracy of PRS prediction in non-European populations is diminished because the sample sizes of their genome-wide association studies (GWAS) are much smaller. In this article, we introduced a novel method to construct PRS for non-European populations, abbreviated as TL-Multi, which uses a transfer learning framework to learn useful knowledge from the European population and correct the bias for non-European populations. We considered non-European GWAS data as the target data and European GWAS data as the informative auxiliary data. TL-Multi borrowed useful information from the auxiliary data to improve the learning accuracy of the target data while preserving efficiency and accuracy. To demonstrate the practical applicability of the proposed method, we applied TL-Multi to predict systemic lupus erythematosus (SLE) risk in the Hong Kong population by borrowing information from the European population. TL-Multi achieved better prediction accuracy than alternative methods, including Lassosum, meta-analysis, and linkage disequilibrium (LD)-informed pruning and P-value thresholding for multiethnic PRS (PT-Multi), and substantially improved prediction performance when the cross-population genetic correlation was moderate, in both simulations and the SLE application.
Venous thromboembolism (VTE) occurs in up to one-third of patients with COVID-19. VTE and COVID-19 may share a common genetic architecture in etiology, which has not been comprehensively investigated. In this study, we leveraged summary-level data from the latest COVID-19 host genetics consortium and UK Biobank to study the genetic commonality between COVID-19 traits and VTE. We found a positive genetic correlation between COVID-19 hospitalization and VTE (rg = 0.2320, P-value = 0.0092). The cross-trait analysis identified shared genetic loci between VTE and COVID-19 traits, including 8 for severe COVID-19, 11 for COVID-19 hospitalization, and 7 for SARS-CoV-2 infection. We identified seven novel mapped genes (LINC00970, TSPAN15, ADAMTS13, F5, DNAJB4, SLC39A8 and OBSCN) that were enriched for expression in lung tissue and in coagulation- and immune-related pathways. Eight genetic loci were found to share causal variants between COVID-19 and VTE; these are located in the ABO, ADAMTS13 and FUT2 gene regions. Bi-directional Mendelian randomization analysis did not suggest a causal relationship between VTE and COVID-19 traits. Our study advances the understanding of the shared genetic etiology of COVID-19 and VTE at the molecular and functional levels.
Venous thromboembolism (VTE) occurs in up to one-third of patients with COVID-19. VTE and COVID-19 may share a common genetic architecture, which has not been clarified. To fill this gap, we leverage summary-level genetic data from the latest COVID-19 host genetics consortium and UK Biobank and examine the shared genetic etiology and causal relationship between COVID-19 and VTE. The cross-trait and co-localization analyses identify 2, 3, and 4 shared loci between VTE and severe COVID-19, COVID-19 hospitalization, and SARS-CoV-2 infection, respectively, which are mapped to the ABO, ADAMTS13, and FUT2 genes involved in coagulation functions. Enrichment analysis supports shared biological processes between COVID-19 and VTE related to coagulation and immunity. Bi-directional Mendelian randomization suggests that VTE is associated with a higher risk of all three COVID-19 traits, and that SARS-CoV-2 infection is associated with a higher risk of VTE. Our study provides timely evidence for the shared genetic etiology of COVID-19 and VTE. Our findings contribute to the understanding of COVID-19 and VTE etiology and provide insights into the prevention and comorbidity management of COVID-19.
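The abstract above does not say which Mendelian randomization estimator was used. As a hedged illustration of the general idea, the sketch below computes the widely used fixed-effect inverse-variance weighted (IVW) estimate from per-variant summary statistics; the function name and all numbers are hypothetical and are not taken from the study. A bi-directional analysis would run the same calculation twice, swapping the roles of exposure and outcome and using instruments selected for each exposure.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect IVW causal estimate and its standard error.

    estimate = sum(bx * by / se^2) / sum(bx^2 / se^2)
    std_err  = sqrt(1 / sum(bx^2 / se^2))
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    numerator = sum(
        bx * by / se**2 for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for two genetic instruments.
toy_estimate, toy_se = ivw_estimate(
    beta_exposure=[0.1, 0.2],
    beta_outcome=[0.05, 0.1],
    se_outcome=[0.01, 0.02],
)
```

In this toy input both instruments imply the same per-variant ratio of outcome effect to exposure effect, so the pooled estimate coincides with that ratio.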