We compared whole-exome sequencing (WES) and whole-genome sequencing (WGS) in six unrelated individuals. In the regions targeted by WES capture (81.5% of the consensus coding genome), the mean numbers of single-nucleotide variants (SNVs) and small insertions/deletions (indels) detected per sample were 84,192 and 13,325, respectively, for WES, and 84,968 and 12,702, respectively, for WGS. For both SNVs and indels, the distributions of coverage depth, genotype quality, and minor read ratio were more uniform for WGS than for WES. After filtering, a mean of 74,398 (95.3%) high-quality (HQ) SNVs and 9,033 (70.6%) HQ indels were called by both platforms. A mean of 105 coding HQ SNVs and 32 indels was identified exclusively by WES, whereas 692 HQ SNVs and 105 indels were identified exclusively by WGS. We Sanger-sequenced a random selection of these exclusive variants. For SNVs, the proportion of false-positive variants was higher for WES (78%) than for WGS (17%). The estimated mean number of real coding SNVs (656 variants, ∼3% of all coding HQ SNVs) identified by WGS and missed by WES was greater than the number of SNVs identified by WES and missed by WGS (26 variants). For indels, the proportions of false-positive variants were similar for WES (44%) and WGS (46%). Finally, WES was not reliable for the detection of copy-number variations, almost all of which extended beyond the targeted regions. Although currently more expensive, WGS is more powerful than WES for detecting potential disease-causing mutations within WES regions, particularly those due to SNVs.

Whole-exome sequencing (WES) is routinely used and is gradually being optimized for the detection of rare and common genetic variants in humans (1-8). However, whole-genome sequencing (WGS) is becoming increasingly attractive as an alternative, due to its broader coverage and decreasing cost (9-11). It remains difficult to interpret variants lying outside the protein-coding regions of the genome.
Diagnostic and research laboratories, whether public or private, therefore tend to search first for coding variants, most of which can be detected by WES. Such variants can also be detected by WGS, and several previous studies have compared WES and WGS for different types of variation and/or in different contexts (9, 11-16), but none did so comprehensively. Here, we compared WES and WGS, in terms of detection rates and quality, for single-nucleotide variants (SNVs), small insertions/deletions (indels), and copy-number variants (CNVs) within the regions of the human genome covered by WES, using the most recent next-generation sequencing (NGS) technologies. We aimed to identify the most efficient and reliable approach for identifying these variants in coding regions of the genome, to define the optimal analytical filters for decreasing the frequency of false-positive variants, and to characterize the genes that were hard to sequence by either approach or poorly covered by WES kits.
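The platform comparison described above amounts to partitioning variant calls into those made by both platforms and those made exclusively by one of them. The following is a minimal sketch of that partition, not the pipeline used in this study. It assumes that each call is keyed by chromosome, position, reference allele, and alternate allele; the `Variant` type, the `compare_call_sets` function, and the "fraction shared" definition (shared calls over the union of calls) are hypothetical names and choices introduced here for illustration.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    """A variant call keyed by site and alleles (hypothetical minimal schema)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def compare_call_sets(wes_calls: Iterable[Variant], wgs_calls: Iterable[Variant]) -> dict:
    """Partition calls into shared, WES-exclusive, and WGS-exclusive sets."""
    wes, wgs = set(wes_calls), set(wgs_calls)
    shared = wes & wgs   # called by both platforms
    union = wes | wgs    # called by at least one platform
    return {
        "shared": shared,
        "wes_only": wes - wgs,
        "wgs_only": wgs - wes,
        # Assumed definition: shared calls as a fraction of all distinct calls.
        "fraction_shared": len(shared) / len(union) if union else 0.0,
    }


# Toy calls for illustration only; these are not data from the study.
wes_toy = [
    Variant("1", 100, "A", "G"),
    Variant("1", 250, "C", "T"),
    Variant("2", 75, "G", "GA"),
]
wgs_toy = [
    Variant("1", 100, "A", "G"),
    Variant("2", 75, "G", "GA"),
    Variant("3", 900, "T", "C"),
    Variant("X", 42, "AT", "A"),
]

result = compare_call_sets(wes_toy, wgs_toy)
print(len(result["shared"]), len(result["wes_only"]), len(result["wgs_only"]))
```

Exact matching on site and alleles is the simplest choice and is adequate for SNVs. Indels can be represented in several equivalent ways, so in practice both call sets would probably need to be normalized to a common representation before such a comparison.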
Results

We compared the two NGS techniques, perform...