Introduction: The emergence of the novel respiratory virus SARS-CoV-2 and the subsequent COVID-19 pandemic have required rapid assimilation of population-level data to understand and control the spread of infection in the general and vulnerable populations. Rapid analyses are needed to inform policy development and target interventions to at-risk groups to prevent serious health outcomes. We aim to provide an accessible research platform to determine demographic, socioeconomic and clinical risk factors for infection, morbidity and mortality of COVID-19, to measure the impact of COVID-19 on healthcare utilisation and long-term health, and to enable the evaluation of natural experiments of policy interventions. Methods and analysis: Two privacy-protecting population-level cohorts have been created and derived from multisourced demographic and healthcare data. The C20 cohort consists of 3.2 million people in Wales on 1 January 2020 with follow-up until 31 May 2020. The complete cohort dataset will be updated monthly, with some individual datasets available daily. The C16 cohort consists of 3 million people in Wales on 1 January 2016 with follow-up to 31 December 2019. C16 is designed as a counterfactual cohort to provide contextual comparative population data on disease, health service utilisation and mortality. Study outcomes will: (a) characterise the epidemiology of COVID-19, (b) assess socioeconomic and demographic influences on infection and outcomes, (c) measure the impact of COVID-19 on short-term and longer-term population outcomes and (d) undertake studies on the transmission and spatial spread of infection. Ethics and dissemination: The Secure Anonymised Information Linkage independent Information Governance Review Panel has approved this study. The study findings will be presented to policy groups, public meetings, national and international conferences, and published in peer-reviewed journals.
We analyse global data for COVID-19 deaths and recoveries and show that outbreak severity displays a striking latitude relationship with a northern hemisphere bias. Transmission rates can be explained by seasonal weather conditions, but this does not account for observed variations in fatality rates. Many factors point to vitamin D as a candidate explanation, but historical controversy surrounding vitamin D studies and the lack of a coherent framework for causal inference have hampered acceptance of this explanation despite a wealth of evidence in its favour. We analyse global COVID-19 data using causal inference, constructing two contrasting directed acyclic graph (DAG) models, one causal and one acausal, and set out clearly the multiple predictions made by each model. We show that observed data strongly match predictions made by the causal model but largely contradict those of the acausal model. We explore historic evidence further supporting the causal model. We review biochemical mechanisms that may explain the various ways in which vitamin D acts. We detail the mechanisms by which the SARS-CoV-2 virus causes the disease and the known pathways that involve vitamin D, and show how these pathways both protect against viral infection and ameliorate disease symptoms in COVID-19 and other respiratory diseases. We examine the factors that govern confidence in causal inference models and conclude that a high level of confidence in a causal beneficial role for vitamin D is justified.
Background: The CVD-COVID-UK consortium was formed to understand the relationship between COVID-19 and cardiovascular diseases through analyses of harmonised electronic health records (EHRs) across the four UK nations. Beyond COVID-19, data harmonisation and common approaches enable analysis within and across independent Trusted Research Environments. Here we describe the reproducible harmonisation method developed using large-scale EHRs in Wales to accommodate the fast and efficient implementation of cross-nation analysis in England and Wales as part of the CVD-COVID-UK programme. We characterise current challenges and share lessons learnt. Methods: Serving the scope and scalability of multiple study protocols, we used linked, anonymised individual-level EHR, demographic and administrative data held within the SAIL Databank for the population of Wales. The harmonisation method was implemented as a four-layer reproducible process, starting from raw data in the first layer. Then each of the layers two to four is framed by, but not limited to, the characterised challenges and lessons learnt. We achieved curated data as part of our second layer, followed by extracting phenotyped data in the third layer. We captured any project-specific requirements in the fourth layer. Results: Using the implemented four-layer harmonisation method, we retrieved approximately 100 health-related variables for the 3.2 million individuals in Wales, which are harmonised with corresponding variables for >56 million individuals in England. We processed 13 data sources into the first layer of our harmonisation method: five of these are updated daily or weekly, and the rest at various frequencies, providing sufficient data flow updates for frequent capturing of up-to-date demographic, administrative and clinical information. Conclusions: We implemented an efficient, transparent, scalable, and reproducible harmonisation method that enables multi-nation collaborative research. With a current focus on COVID-19 and its relationship with cardiovascular outcomes, the harmonised data has supported a wide range of research activities across the UK.
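The four-layer process described in this abstract (raw, curated, phenotyped, project-specific) can be pictured with a toy sketch. This is only an illustration under stated assumptions: the function names, the field names `person_id` and `code`, and the placeholder code `"X1"` are all hypothetical and do not come from the CVD-COVID-UK implementation or the SAIL Databank.

```python
# Toy illustration of a layered harmonisation flow. All names and codes are hypothetical.

def curate(raw_rows):
    """Layer 2: normalise key names to lower case and drop rows without an identifier."""
    curated = []
    for row in raw_rows:
        row = {key.lower(): value for key, value in row.items()}
        if row.get("person_id") is not None:
            curated.append(row)
    return curated


def phenotype(curated_rows, code_list):
    """Layer 3: one row per person, flagged if any of their codes is in code_list."""
    flags = {}
    for row in curated_rows:
        pid = row["person_id"]
        flags[pid] = flags.get(pid, False) or row.get("code") in code_list
    return [{"person_id": pid, "has_condition": flag} for pid, flag in sorted(flags.items())]


def project_view(phenotyped_rows, wanted_fields):
    """Layer 4: keep only the fields a given project requires."""
    return [{field: row[field] for field in wanted_fields} for row in phenotyped_rows]


# Layer 1: raw rows as they might arrive from a source system (made-up values).
raw = [
    {"PERSON_ID": 1, "CODE": "X1"},
    {"PERSON_ID": 1, "CODE": "Y2"},
    {"PERSON_ID": None, "CODE": "X1"},
    {"PERSON_ID": 2, "CODE": "Z3"},
]

result = project_view(phenotype(curate(raw), {"X1"}), ["person_id", "has_condition"])
```

Keeping each layer as a separate step means a change to a phenotype definition or a project requirement only re-runs the later layers, which is one plausible reading of why a layered design supports reproducibility.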
Background: Better understanding of the role that children and school staff play in the transmission of SARS-CoV-2 is essential to guide policy development on controlling infection while minimising disruption to children’s education and well-being. Methods: Our national e-cohort (n=464531) study used anonymised linked data for pupils, staff and associated households linked via educational settings in Wales. We estimated the odds of testing positive for SARS-CoV-2 infection for staff and pupils over the period August–December 2020, dependent on measures of recent exposure to known cases linked to their educational settings. Results: The total number of cases in a school was not associated with a subsequent increase in the odds of testing positive (staff OR per case: 0.92, 95% CI 0.85 to 1.00; pupil OR per case: 0.98, 95% CI 0.93 to 1.02). Among pupils, the number of recent cases within the same year group was significantly associated with subsequent increased odds of testing positive (OR per case: 1.12, 95% CI 1.08 to 1.15). These effects were adjusted for a range of demographic covariates, and in particular any known cases within the same household, which had the strongest association with testing positive (staff OR: 39.86, 95% CI 35.01 to 45.38; pupil OR: 9.39, 95% CI 8.94 to 9.88). Conclusions: In a national school cohort, the odds of staff testing positive for SARS-CoV-2 infection were not significantly increased in the 14-day period after case detection in the school. However, pupils were found to be at increased odds, following cases appearing within their own year group, where most of their contacts occur. Strong mitigation measures over the whole of the study period may have reduced wider spread within the school environment.
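For readers less familiar with "OR per case" figures, the sketch below shows how an odds ratio and its interval relate to a log-odds coefficient. It assumes the estimates come from a logistic-type model with Wald (normal-approximation) intervals, which the abstract does not state; the function names are illustrative only.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Turn a log-odds coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def implied_beta_se(odds_ratio, lower, upper, z=1.96):
    """Back out the log-odds coefficient and an approximate SE from a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


def odds_multiplier(or_per_case, n_cases):
    """Under a per-case model the OR compounds multiplicatively across cases."""
    return or_per_case ** n_cases


# Reported pupil estimate for recent cases in the same year group: OR 1.12 (1.08 to 1.15).
beta, se = implied_beta_se(1.12, 1.08, 1.15)
or_, ci_low, ci_high = odds_ratio_ci(beta, se)

# Odds multiplier implied by three recent year-group cases, if the per-case effect held.
three_cases = odds_multiplier(1.12, 3)
```

Because published bounds are rounded, the interval rebuilt from the implied standard error is symmetric on the log scale and will not match the reported bounds exactly.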