Observational data on COVID-19, including hypothesised risk factors for infection and progression, are accruing rapidly, often from non-random samples such as hospital admissions, targeted testing or voluntary participation. Here, we highlight the challenge of interpreting observational evidence from such samples of the population, which may be affected by collider bias. We illustrate these issues using data from the UK Biobank, in which individuals tested for COVID-19 are highly selected for a wide range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. We discuss the sampling mechanisms that leave aetiological studies of COVID-19 infection and progression particularly susceptible to collider bias. We also describe several tools and strategies that could help mitigate the effects of collider bias in extant studies of COVID-19, and we make available a web app for performing sensitivity analyses. While bias due to non-random sampling should be explored in existing studies, the optimal way to mitigate the problem is to use appropriate sampling strategies at the study design stage.
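The collider-bias mechanism described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the analysis or the web app from the paper: the variable names (`risk`, `severity`, `tested`), the selection rule and the threshold are all hypothetical. Two traits are generated independently, and each raises the chance of being tested; restricting to tested individuals then induces a negative correlation between them.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000

# Hypothetical traits, independent by construction in the full population.
risk = [rng.gauss(0, 1) for _ in range(n)]
severity = [rng.gauss(0, 1) for _ in range(n)]

# Assumed selection rule: both traits (plus noise) push someone into the
# tested sample, which makes "tested" a collider.
tested = [r + s + rng.gauss(0, 1) > 1.0 for r, s in zip(risk, severity)]

corr_all = pearson(risk, severity)
corr_tested = pearson(
    [r for r, t in zip(risk, tested) if t],
    [s for s, t in zip(severity, tested) if t],
)

print(f"correlation in full population: {corr_all:.3f}")
print(f"correlation among tested only:  {corr_tested:.3f}")
```

In the full population the correlation is close to zero, while among the tested it is clearly negative, even though neither trait affects the other.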
Estimates from genome-wide association studies (GWAS) represent a combination of the effect of inherited genetic variation (direct effects), demography (population stratification, assortative mating) and genetic nurture from relatives (indirect genetic effects). GWAS using family-based designs can control for demography and indirect genetic effects, but large-scale family datasets have been lacking. We combined data on 159,701 siblings from 17 cohorts to generate population (between-family) and within-sibship (within-family) estimates of genome-wide genetic associations for 25 phenotypes. We demonstrate that existing GWAS associations for height, educational attainment, smoking, depressive symptoms, age at first birth and cognitive ability overestimate direct effects. We show that estimates of SNP-heritability, genetic correlations and Mendelian randomization involving these phenotypes substantially differ when calculated using within-sibship estimates. For example, genetic correlations between educational attainment and height largely disappear. In contrast, analyses of most clinical phenotypes (e.g. LDL-cholesterol) were generally consistent between population and within-sibship models. We also report compelling evidence of polygenic adaptation on taller human height using within-sibship data. Large-scale family datasets provide new opportunities to quantify direct effects of genetic variation on human traits and diseases.
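The contrast between population and within-sibship estimates can be sketched with simulated sibling pairs. This is an illustrative toy model under assumptions of our own, not the estimator or data used in the study: the effect sizes `beta` (direct effect) and `gamma` (family-level confounding), the variable names and the two-sibling design are all hypothetical. The genotype score has a component shared within a family that also influences the phenotype through the family environment, so the population slope overstates the direct effect, while the slope on deviations from the family mean does not.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n_families = 10000
beta = 0.2   # assumed direct effect of the genotype score
gamma = 0.4  # assumed effect of the shared family component (confounding)

g_all, y_all = [], []  # pooled individual-level values
g_dev, y_dev = [], []  # deviations from the family (sibship) mean

for _ in range(n_families):
    shared = rng.gauss(0, 1)  # genetic component shared by both siblings
    g = [shared + rng.gauss(0, 1) for _ in range(2)]
    y = [beta * gi + gamma * shared + rng.gauss(0, 1) for gi in g]
    g_mean = sum(g) / 2
    y_mean = sum(y) / 2
    for gi, yi in zip(g, y):
        g_all.append(gi)
        y_all.append(yi)
        g_dev.append(gi - g_mean)
        y_dev.append(yi - y_mean)

population_estimate = slope(g_all, y_all)
within_sibship_estimate = slope(g_dev, y_dev)

print(f"population estimate:     {population_estimate:.3f}")
print(f"within-sibship estimate: {within_sibship_estimate:.3f}")
```

Under these assumptions the population estimate is inflated to roughly `beta + gamma / 2`, whereas the within-sibship estimate stays close to `beta`, because subtracting the family mean removes everything siblings share.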
The shape and appearance of the optic nerve head region are sensitive to changes associated with glaucoma and diabetes that may otherwise be asymptomatic. These changes can be diagnostic of the diseases, and tracking them in sequential images can be used to assess treatment and the progress of the illness. At present, change detection and tracking are performed manually, which can lead to poor repeatability. We are concerned with developing automated techniques for generating quantitative descriptions of retinal images that might be used in diagnosis and assessment. In this paper, we investigate the use of images that have been collected and stored remotely, as this replicates capture and automated processing by outreach clinics. Normal and abnormal images were collected from a range of sources to simulate the mass screening process. The images were processed using simple signal-processing methods and divided into two groups. A chi-squared test found the separation of normal and abnormal images to be highly significant (p < 0.05, n = 60).
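A chi-squared test of this kind can be sketched as follows. The abstract does not report the underlying counts, so the 2x2 table below is invented purely for illustration (it only matches the stated total of n = 60), and the assumption that the test was a plain Pearson chi-squared test on a 2x2 table without continuity correction is ours. The p-value uses the identity that, for one degree of freedom, the upper tail of the chi-squared distribution equals `erfc(sqrt(x / 2))`.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value for the 2x2 table
    [[a, b], [c, d]] (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# HYPOTHETICAL counts, not the study's data:
# rows    = true status (normal, abnormal)
# columns = group assigned by the automated processing (group 1, group 2)
chi2, p = chi_squared_2x2(25, 5, 8, 22)

print(f"chi-squared = {chi2:.2f}, p = {p:.2g}")
```

With real data, the four counts would come from cross-tabulating the known status of each image against the group it was placed in.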