Although it is heavily relied on to study genetic contributors to health and disease, the current human reference genome (GRCh38) is incomplete in two major ways: firstly, it is missing large sections of heterochromatic sequence, and secondly, as a singular, linear reference genome it does not represent the full spectrum of genetic diversity that exists in the human species. In order to better understand and characterize gaps in GRCh38 and genetic diversity, we developed a method - ASLAN, an Algorithm for Sequence Location Approximation using Nuclear families - that identifies the region of origin of short reads that do not align to the GRCh38. Using unmapped reads and variant calls from whole genome sequencing (WGS) data from nuclear families, ASLAN relies on a maximum likelihood model to identify the most likely region of the genome that a subsequence belongs to, given the phasing information of family and the distribution of the subsequence in the unmapped reads. Validating ASLAN on a synthetically generated dataset, and on true reads originating from the alternative haplotypes in the decoy genome, we show that ASLAN can localize more than 90% of 100-basepair sequences with above 92% accuracy and around 1 megabase of resolution. We then run ASLAN on 100-mers from unmapped reads from WGS from over 700 families, and compare ASLAN localizations to alignment of the 100-mers to the T2T-CHM13 assembly, recently released by the Telomere-to-telomere (T2T) consortia. We find that many unmapped reads in GRCh38 originate from telomeres and centromeres that are gaps in the GRCh38 reference. We also confirm that ASLAN localizations are in high concordance with T2T-CHM13 alignments, except in the centromeres of the acrocentric chromosomes. Comparing ASLAN localizations and T2T-CHM13 alignments, we identify sequences missing from T2T-CHM13 or sequences with high divergence from their aligned region in T2T-CHM13, thus highlighting new hotspots for genetic diversity.