2018
DOI: 10.1371/journal.pone.0204912

Whole-exome sequencing capture kit biases yield false negative mutation calls in TCGA cohorts

Abstract: The Cancer Genome Atlas (TCGA) provides a genetic characterization of more than ten thousand tumors, enabling the discovery of novel driver mutations, molecular subtypes, and enticing drug targets across many histologies. Here we investigated why some mutations are common in particular cancer types but absent in others. As an example, we observed that the gene CCDC168 has no mutations in the stomach adenocarcinoma (STAD) cohort despite its common presence in other tumor types. Surprisingly, we found that the l…

Cited by 27 publications (25 citation statements)
References 29 publications

“…We investigated the potential amount of bias in individual VAF estimates that may be associated with each kit and may be a unique characteristic with certain variants. Bias in VAF can arise from several factors: bait-hybridization bias, mapping errors and bias in mapping towards the reference genome [36], and confusion in calling a multiple nucleotide polymorphism or a complex multi-allelic polymorphism. To separate mapping and calling issues from bait biases, we ran several individual pipelines including a compendium method (SomaticSeq) for each kit of pooled Sample A replicates as well as in silico Sample A and merged-BAM Sample A (see “Methods” for details).…”
Section: Results
Citation type: mentioning
confidence: 99%
“…If not adequately addressed in the analysis, batch effects reduce statistical power and lead to both false-positive and false-negative associations. Practices in large sequencing studies that commonly introduce batch effects include dividing samples among multiple sequencing centers [2,3], collecting or preparing samples under different protocols [4], and extracting exomes using different target capture kits [4,5]. For example, the Alzheimer's Disease Sequencing Project (ADSP) sequenced exomes of more than 10,000 cases and controls to identify genetic factors associated with Alzheimer's disease (AD) [6].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…It is well known that the efficiency of capture depends on several experimental procedures as well as on probe design, which may directly affect sequence depth and uniformity (Do et al, 2012; Chandler et al, 2016). Therefore, problems in the capturing reaction directly affect the final experiment results, yielding not only regions with different average depths but also leading to regions with no coverage at all (Lionel et al, 2018; Wang et al, 2018). We demonstrated here that differences, most likely attributed to the different methods used by the sequencing centers, proved to play a significant role in determining the distribution of sequencing depth in WES data from the 1000 Genomes Project.…”
Section: Discussion
Citation type: mentioning
confidence: 99%