Objectives
Allele counts of sequence variants obtained by whole genome sequencing (WGS) often play a central role in interpreting the results of genetic and genomic research. However, such variant counts are not readily available for individuals in the Danish population. Here, we present a dataset with allele counts for sequence variants (single nucleotide variants (SNVs) and indels) identified from WGS of 8,671 individuals (5,418 females) from the Danish population. The data resource is based on WGS data from three independent research projects aimed at assessing genetic risk factors for cardiovascular, psychiatric, and headache disorders. To enable the sharing of information on sequence variation in Danish individuals, we created summary statistics on allele counts from anonymized data and made them available through the European Genome-phenome Archive (EGA, https://identifiers.org/ega.dataset:EGAD00001009756) and in a dedicated browser, DanMAC5 (available at www.danmac5.dk). The summary-level data and the DanMAC5 browser provide insight into the allelic spectrum of sequence variants segregating in the Danish population, which is important for variant interpretation.
Data description
Three WGS datasets with an average coverage of 30x were processed independently using the same quality control pipeline. Subsequently, we summarized, filtered, and merged allele counts to create a high-quality summary-level dataset of sequence variants.
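As an illustration only, the merge step can be sketched as summing per-variant allele counts across cohorts and then applying a count filter. This is a minimal sketch and not the pipeline used to build the dataset: the function name `merge_allele_counts`, the `min_ac` threshold, the variant key layout, and the example numbers are all hypothetical.

```python
from collections import defaultdict


def merge_allele_counts(cohorts, min_ac=1):
    """Sum allele counts (AC) and allele numbers (AN) per variant across cohorts.

    Each cohort is a dict mapping a variant key (chrom, pos, ref, alt)
    to an (AC, AN) pair. Variants whose merged AC is below min_ac are
    dropped. The min_ac threshold is a hypothetical filter for illustration.
    """
    totals = defaultdict(lambda: [0, 0])
    for cohort in cohorts:
        for variant, (ac, an) in cohort.items():
            totals[variant][0] += ac
            totals[variant][1] += an
    # Keep only variants that pass the merged allele count filter.
    return {v: (ac, an) for v, (ac, an) in totals.items() if ac >= min_ac}


# Made-up example data for two hypothetical cohorts.
cohort_a = {("1", 1000, "A", "G"): (3, 200)}
cohort_b = {
    ("1", 1000, "A", "G"): (4, 300),
    ("2", 500, "C", "CT"): (1, 300),
}

merged = merge_allele_counts([cohort_a, cohort_b], min_ac=2)
print(merged)
```

Note that in this sketch a variant missing from a cohort contributes nothing to the allele number, so the merged AN understates the number of chromosomes actually assessed; a real pipeline would need to account for sites that were called as reference in a cohort.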