Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates. We identify a higher proportion of indels and SVs than previous efforts showing the merits of high coverage and de novo assembly approaches. In addition, we use trio information to identify de novo mutations and use a probabilistic method to provide direct estimates of 1.27e−8 and 1.5e−9 per nucleotide per generation for SNVs and indels, respectively.
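To give a sense of scale for the per-nucleotide rates quoted above, the following Python sketch converts them into an expected number of de novo events per child. It is a back-of-the-envelope illustration only, not the probabilistic method of the study: the haploid genome size of 3.0e9 sites and the function name `expected_de_novo` are assumptions introduced here.

```python
# Rates quoted in the abstract (per nucleotide per generation).
SNV_RATE = 1.27e-8
INDEL_RATE = 1.5e-9

# ASSUMPTION: roughly 3.0e9 sites per haploid genome. The callable genome
# in a real study is smaller, so this is only an order-of-magnitude guide.
HAPLOID_SITES = 3.0e9


def expected_de_novo(rate_per_site, haploid_sites=HAPLOID_SITES):
    """Expected de novo events in one child (hypothetical helper).

    A child inherits two haploid genomes, one per parent, and each
    transmitted copy is subject to the per-site rate.
    """
    return rate_per_site * 2 * haploid_sites


if __name__ == "__main__":
    print(f"expected de novo SNVs per child:   {expected_de_novo(SNV_RATE):.1f}")
    print(f"expected de novo indels per child: {expected_de_novo(INDEL_RATE):.1f}")
```

Under these assumptions the rates correspond to roughly 76 de novo SNVs and about 9 de novo indels per child; a different callable genome size scales both figures proportionally.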
Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits 1-4. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly 2,5-7. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology 4,8-13. We use the assemblies to identify a rich set of structural variants including many novel insertions and demonstrate how this variant catalogue enables further deciphering of known association mapping signals. We leverage the assemblies to provide 100 completely resolved major histocompatibility complex haplotypes and to resolve major parts of the Y chromosome. Our study provides a regional reference genome that we expect will improve the power of future association mapping studies and hence pave the way for precision medicine initiatives, which now are being launched in many countries including Denmark.
Using a combination of high-depth (average 78×) Illumina paired-end and mate-pair libraries, we applied Allpaths-LG 14 to create de novo assemblies of high quality and coverage for each of the 150 individuals with a median scaffold N50 of ~21 megabases (Mb; maximum ~30 Mb) (Supplementary Table 1). The 100 largest scaffolds in each of the 140 best assemblies typically covered more than 75% (median 77%, Extended Data Fig. 1a) of the genome, with the largest scaffolds exceeding 110 Mb in size (Supplementary Table 1). To evaluate the accuracy of the assemblies, we subsequently aligned the scaffolds for each individual to the human reference genome (GRCh38) 15. Figure 1 shows an example individual where the euchromatic part of each chromosome was almost completely covered by a few large scaffolds and in several cases scaffolds covered almost entire chromosome arms. Only rarely did we find that large scaffolds break and align to more than one chromosome (Extended Data Fig. 1b), suggesting that even the largest scaffolds are seldom chimaeric. We also compared our de novo assemblies with a published long-read assembly based on BioNano mapping and PacBio sequencing 16. Extended Data Figs 2a and 3 show that this assembly was less complete than our assemblies, but with similar scaffold lengths. The long-read assembly had 5.38% missing data compared with our median of 4.25% (Extended Data Fig. 3a), but the missing data in our assemblies were found in smaller gaps (Extended Data Fig. 3b, c), and the median contig length was therefore much smaller th...
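The scaffold N50 quoted in the excerpt above is the length L such that scaffolds of length at least L together contain at least half of all assembled bases. A minimal Python sketch of that definition follows; the function name `scaffold_n50` and the toy lengths are illustrative assumptions and are not data from the study.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Walk from the longest scaffold downwards, accumulating bases until
    # at least half of the total assembly length is covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy scaffold lengths in Mb (hypothetical values, total 100 Mb).
    toy_lengths_mb = [30, 21, 15, 10, 8, 6, 5, 5]
    print(scaffold_n50(toy_lengths_mb))  # prints 21
```

In the toy example the two longest scaffolds (30 + 21 = 51 Mb) already exceed half of the 100 Mb total, so the N50 is 21 Mb. The same function applies to contig lengths, which is how a contig N50 would be compared with a scaffold N50.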
Comparative genome analysis of strains of a pathogenic bacterial species can be a powerful tool to discover acquisition of mobile genetic elements related to virulence. Here, we compared 28 V. anguillarum strains that differed in virulence in fish larval models. By pan-genome analyses, we found that six of nine highly virulent strains had a unique core and accessory genome. In contrast, V. anguillarum strains that were medium to nonvirulent had low genomic diversity. Integration of genomic and phenotypic features provides insights into the evolution of V. anguillarum and can also be important for survey and diagnostic purposes.
Anomalies of eye development can lead to the rare eye malformations microphthalmia and anophthalmia (small or absent ocular globes), which are genetically very heterogeneous. Several genes have been associated with microphthalmia and anophthalmia, and exome sequencing has contributed to the identification of new genes. Very recently, homozygous variations within ALDH1A3 have been associated with autosomal recessive microphthalmia with or without cysts or coloboma, and with variable subphenotypes of developmental delay/autism spectrum disorder in eight families. In a consanguineous family where three of the five siblings were affected with microphthalmia/coloboma, we identified a novel homozygous missense mutation in ALDH1A3 using exome sequencing. Of the three affected siblings, one had intellectual disability and one had intellectual disability and autism, while the last one presented with normal development. This study contributes further to the description of the clinical spectrum associated with ALDH1A3 mutations, and illustrates the interfamilial clinical variation observed in individuals with ALDH1A3 mutations.
Background: The BREATHE study is a cross-sectional study of real-life patients with asthma and/or COPD in Denmark and Sweden aiming to increase the knowledge across severities and combinations of obstructive airway disease. Design: Patients with suspected asthma and/or COPD and healthy controls were invited to participate in the study and underwent a standard evaluation consisting of questionnaires, physical examination, FeNO and lung function, mannitol provocation test, allergy test, and collection of sputum and blood samples. A subgroup of patients and healthy controls underwent bronchoscopy with collection of airway samples. Results: The study population consisted of 1403 patients with obstructive airway disease (859 with asthma, 271 with COPD, 126 with concurrent asthma and COPD, 147 with other), and 89 healthy controls (smokers and non-smokers). Of patients with asthma, 54% had moderate-to-severe disease and 46% had mild disease. Of patients with COPD, 82% had disease classified as groups A and B, whereas 18% had disease classified as groups C and D. Patients with asthma more frequently had childhood asthma, atopic dermatitis, and allergic rhinitis, compared to patients with COPD, asthma + COPD and Other, whereas FeNO levels were higher in patients with asthma and asthma + COPD compared to COPD and Other (18 ppb and 16 ppb vs 12.5 ppb and 14 ppb, p < 0.001). Patients with asthma, asthma + COPD and Other had higher sputum eosinophilia (1.5%, 1.5%, 1.2% vs 0.75%, respectively, p < 0.001) but lower sputum neutrophilia (39.3%, 43.5%, 40.8% vs 66.8%, p < 0.001) compared to patients with COPD. Conclusions: The BREATHE study provides a unique database and biobank with clinical information and samples from 1403 real-life patients with asthma, COPD, and overlap representing different severities of the diseases. This research platform is highly relevant for disease phenotype and biomarker studies aiming to describe a broad spectrum of obstructive airway diseases.