Genotype imputation using a reference panel is a cost-effective strategy for filling in millions of missing genotypes for various genetic analyses. Here, we present the Northeast Asian Reference Database (NARD), which includes whole-genome sequencing data of 1,781 individuals from Korea, Mongolia, Japan, China, and Hong Kong. NARD captures the genetic diversity of Korean (n=850) and Mongolian (n=386) ancestries, which were not represented in the 1000 Genomes Project Phase 3 (1KGP3). We combined and re-phased the genotypes from NARD and 1KGP3 to construct a union set of haplotypes. This approach established a robust imputation reference panel for Northeast Asian populations, which yields the greatest imputation accuracy for rare and low-frequency variants compared with the existing panels. We also illustrate that NARD can potentially improve disease variant discovery by reducing the number of pathogenic candidates. Overall, this study provides a useful reference panel for genetic studies in Northeast Asia.

During the past decade, whole-genome sequencing (WGS) of reference populations has enabled extensive human genetic research 1,2 . It has played an essential role in genetic research, especially for genotype imputation in genome-wide association studies (GWAS). With the recent growth in the number of such studies, several research groups have generated extensive WGS data to build reference panels 1-10 . The most commonly used imputation panels are those constructed by the 1000 Genomes Project Phase 3 (1KGP3) and the Haplotype Reference Consortium (HRC), which are publicly available to researchers. Because genotype imputation increases the power of GWAS in a cost-efficient way, the confidence of the imputed genotypes is important. 
To improve the quality of imputation in genetic studies, large-scale population-specific panels with deep-coverage WGS are needed 4,9 .

Although Northeast Asians account for 21.51% of the worldwide population (see URLs), the majority of genetic studies and reference panels are biased toward Europeans 11 . There are only a few population-scale WGS studies covering Northeast Asians from China, Japan, and Mongolia 6,8,12,13 , and these studies have several limitations as a basis for an improved reference panel in Northeast Asia, such as lack of public availability, inadequate sequencing coverage, and small sample size. Furthermore, although Koreans (KOR) are one of the major population groups in Northeast Asia, previous datasets for KOR 14-16 do not have enough WGS samples to accurately impute the genome-wide variants of the KOR population. Therefore, constructing a large-scale whole-genome reference panel with deep sequencing coverage for the diverse population groups in Northeast Asia is still necessary to allow dense and accurate genotype imputation for genetic research in these populations.

In this study, we constructed the Northeast Asian Reference Database (NARD), consisting of 1,781 individuals from Korea, Japan, Mongolia, China, and Hong Kong. The goal of this study is t...