Analysis of structural variations
(SVs) is important for understanding
mutations underlying genetic disorders and pathogenic conditions.
However, characterizing SVs using short-read, high-throughput sequencing
technology is difficult. Although long-read sequencing technologies
are being increasingly employed in characterizing SVs, their low throughput
and high costs discourage widespread adoption. Sequence motif-based
optical mapping in nanochannels is useful in whole-genome mapping
and SV detection, but it cannot precisely locate breakpoints
or estimate copy numbers. We present here a universal multicolor
mapping strategy in nanochannels that combines a conventional sequence-motif
labeling system with Cas9-mediated, target-specific labeling of any
20-base sequence (20mer) to create custom labels and detect new
features. The sequence motifs are labeled with green fluorophores
and the 20mers are labeled with red fluorophores. Using this strategy,
it is possible not only to detect SVs but also to use custom
labels to interrogate features not accessible to motif labeling,
locate breakpoints, and precisely estimate copy numbers of genomic
repeats. We validated our approach by quantifying the D4Z4 copy number,
a known biomarker for facioscapulohumeral muscular dystrophy (FSHD),
and by estimating telomere length, a clinical biomarker for assessing
disease risk in aging-related diseases and malignant cancers.
We also demonstrate the application of our methodology in discovering
transposable long interspersed nuclear element 1 (LINE-1) insertions
across the whole genome.