Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry 1-3. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific 4-10. Additionally, effect sizes and their derived risk prediction scores derived in one population may
The nematode Caenorhabditis elegans is an important model for studies of germ cell biology, including the meiotic cell cycle, gamete specification as sperm or oocyte, and gamete development. Fundamental to those studies is a genome-level knowledge of the germline transcriptome. Here, we use RNA-Seq to identify genes expressed in isolated XX gonads, which are approximately 95% germline and 5% somatic gonadal tissue. We generate data from mutants making either sperm [fem-3(q96)] or oocytes [fog-2(q71)], both grown at 22°. Our dataset identifies a total of 10,754 mRNAs in the polyadenylated transcriptome of XX gonads, with 2748 enriched in spermatogenic gonads, 1732 enriched in oogenic gonads, and the remaining 6274 not enriched in either. These spermatogenic, oogenic, and gender-neutral gene datasets compare well with those of previous studies, but double the number of genes identified. A comparison of the additional genes found in our study with in situ hybridization patterns in the Kohara database suggests that most are expressed in the germline. We also query our RNA-Seq data for differential exon usage and find 351 mRNAs with sex-enriched isoforms. We suggest that this new dataset will prove useful for studies focusing on C. elegans germ cell biology.
Cardiometabolic diseases are an increasing global health burden. While socioeconomic, environmental, behavioural, and genetic risk factors have been identified, a better understanding of the underlying mechanisms is required to develop more effective interventions. Magnetic resonance imaging (MRI) has been used to assess organ health, but biobank-scale studies are still in their infancy. Using over 38,000 abdominal MRI scans in the UK Biobank, we use deep learning to quantify volume, fat, and iron in seven organs and tissues, and demonstrate that imaging-derived phenotypes reflect health status. We show that these traits have a substantial heritable component (8–44%) and identify 93 independent genome-wide significant associations, including four associations with liver traits that have not previously been reported. Our work demonstrates the tractability of deep learning to systematically quantify health parameters from high-throughput MRI across a range of organs and tissues, and uses the largest-ever study of its kind to generate new insights into the genetic architecture of these traits.
Highlights
• Genomic data linked to health records capture demography in health systems
• Genetic networks reveal recent common ancestry in diverse populations
• Evidence of many founder populations in New York City
• Fine-scale population structure impacts genetic risk predictions
The Clinical Genome Resource (ClinGen) Ancestry and Diversity Working Group highlights the need to develop guidance on race, ethnicity, and ancestry (REA) data collection and use in clinical genomics. We present quantitative and qualitative evidence to characterize: 1) acquisition of REA data via clinical laboratory requisition forms, and 2) information disparity across populations in the Genome Aggregation Database (gnomAD) at clinically relevant sites ascertained from annotations in ClinVar. Our requisition form analysis showed substantial heterogeneity in clinical laboratory ascertainment of REA, as well as marked incongruity among terms used to define REA categories. There was also striking disparity across REA populations in the amount of information available about clinically relevant variants in gnomAD. European ancestral populations constituted the majority of observations (55.8%), allele counts (59.7%), and private alleles (56.1%) in gnomAD at 550 loci with “pathogenic” and “likely pathogenic” expert-reviewed variants in ClinVar. Our findings highlight the importance of implementing and supporting programs to increase diversity in genome sequencing and clinical genomics, as well as measuring uncertainty around population-level datasets that are used in variant interpretation. Finally, we suggest the need for a standardized REA data collection framework to be developed through partnerships and collaborations and adopted across clinical genomics.