Objective: To examine sex and gender roles in COVID-19 test positivity and hospitalisation in sex-stratified predictive models using machine learning.
Design: Cross-sectional study.
Setting: UK Biobank prospective cohort.
Participants: Participants tested between 16 March 2020 and 18 May 2020 were analysed.
Main outcome measures: The endpoints of the study were COVID-19 test positivity and hospitalisation. Forty-two variables covering individuals' demographics, psychosocial factors and comorbidities were used as likely determinants of outcomes. A gradient boosting machine was used to build the prediction models.
Results: Of 4510 individuals tested (51.2% female, mean age=68.5±8.9 years), 29.4% tested positive. Males were more likely to be positive than females (31.6% vs 27.3%, p=0.001). In females, living in more deprived areas, lower income, increased low-density lipoprotein (LDL) to high-density lipoprotein (HDL) ratio, working night shifts and living with a greater number of family members were associated with a higher likelihood of a positive COVID-19 test. In males, by contrast, greater body mass index and LDL to HDL ratio were the factors associated with a positive test. Older age and adverse cardiometabolic characteristics were the most prominent variables associated with hospitalisation of test-positive patients in both overall and sex-stratified models.
Conclusion: High-risk jobs, crowded living arrangements and living in deprived areas were associated with increased COVID-19 infection in females, while high-risk cardiometabolic characteristics were more influential in males. Gender-related factors have a greater impact on females; hence, they should be considered in identifying priority groups for COVID-19 vaccination campaigns.
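The male versus female positivity comparison reported above (31.6% vs 27.3%) can be roughly checked with a two-proportion z-test. The sketch below is illustrative only: the abstract does not say which statistical test the authors used, and the group counts are approximations reconstructed here from the rounded percentages, not figures taken from the paper. The function name `two_proportion_z_test` is hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions.

    x1/n1 and x2/n2 are the positive counts and group sizes.
    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled proportion under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION: counts are reconstructed from the rounded percentages in the
# abstract (4510 tested, 51.2% female, 27.3% / 31.6% positive), so they are
# approximate and may differ slightly from the study's true counts.
n_total = 4510
n_female = round(n_total * 0.512)
n_male = n_total - n_female
pos_female = round(n_female * 0.273)
pos_male = round(n_male * 0.316)

z, p_value = two_proportion_z_test(pos_male, n_male, pos_female, n_female)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Under these assumptions the p-value falls in the same order of magnitude as the reported p=0.001. An exact match is not expected, because the inputs are rounded and the authors' test is unknown.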
Sharing health data for research purposes across international jurisdictions has been a challenge due to privacy concerns. Two privacy enhancing technologies that can enable such sharing are synthetic data generation (SDG) and federated analysis, but their relative strengths and weaknesses have not been evaluated thus far. In this study we compared SDG with federated analysis to enable such international comparative studies. The objective of the analysis was to assess country-level differences in the role of sex on cardiovascular health (CVH) using a pooled dataset of Canadian and Austrian individuals. The Canadian data was synthesized and sent to the Austrian team for analysis. The utility of the pooled (synthetic Canadian + real Austrian) dataset was evaluated by comparing the regression results from the two approaches. The privacy of the Canadian synthetic data was assessed using a membership disclosure test which showed an F1 score of 0.001, indicating low privacy risk. The outcome variable of interest was CVH, calculated through a modified CANHEART index. The main and interaction effect parameter estimates of the federated and pooled analyses were consistent and directionally the same. It took approximately one month to set up the synthetic data generation platform and generate the synthetic data, whereas it took over 1.5 years to set up the federated analysis system. Synthetic data generation can be an efficient and effective tool for enabling multi-jurisdictional studies while addressing privacy concerns.
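The membership disclosure result above is reported as an F1 score. As a minimal sketch of what that metric measures, the code below scores a hypothetical attacker's guesses about which records were in the training data. The abstract does not describe the study's actual attack procedure, so the function name `membership_f1` and the toy labels are illustrative assumptions, not the authors' method or data.

```python
def membership_f1(predicted_member, actual_member):
    """F1 score of an attacker's membership guesses.

    predicted_member[i] is True if the attacker claims record i was in the
    training data; actual_member[i] is True if it really was.
    """
    pairs = list(zip(predicted_member, actual_member))
    tp = sum(1 for p, a in pairs if p and a)      # correct membership claims
    fp = sum(1 for p, a in pairs if p and not a)  # wrong membership claims
    fn = sum(1 for p, a in pairs if not p and a)  # members the attacker missed
    if tp == 0:
        return 0.0  # no correct claims: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, for illustration only.
predicted = [True, False, True, False, False, True]
actual = [True, True, False, False, False, True]
score = membership_f1(predicted, actual)
print(f"F1 = {score:.3f}")
```

A score near zero, such as the 0.001 reported above, means the attacker's membership claims were almost never both precise and complete.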