2020
DOI: 10.1101/2020.03.15.992750
Preprint

Detecting sample swaps in diverse NGS data types using linkage disequilibrium

Abstract: As the number of genomics datasets grows rapidly, sample mislabeling has become a high stakes issue. We present CrosscheckFingerprints (Crosscheck), a tool for quantifying sample-relatedness and detecting incorrectly paired sequencing datasets from different donors. Crosscheck outperforms similar methods and is effective even when data are sparse or from different assays. Application of Crosscheck to 8851 ENCODE ChIP-, RNA-, and DNase-seq datasets enabled us to identify and correct dozens of mislabeled samples…

Citations: cited by 3 publications (3 citation statements)
References: 26 publications
“…For all samples, we excluded the possibility of sample mismatch by comparing germline variants identified in normal tissue to neoplasia samples of a given patient. The reads of each read group were extracted with SAMtools view using the options '-bh {input_bam} -r {read_group_id}', and GATK's CheckFingerprint tool was applied to extract statistics on sample-patient matches [75]. For virtually all high-purity samples without extensive loss of heterozygosity, we were able to confirm that the samples were obtained from the expected patient.…”
Section: Verification Of Sample-patient Matches
Citation type: mentioning
confidence: 99%
“…For all samples we excluded the possibility of sample mismatch by comparing germline variants identified in normal tissue to neoplasia samples of a given patient. The reads of each read-group were extracted with samtools view using options ‘-bh {input_bam} -r {read_group_id}’ and GATK’s CheckFingerprint tool was applied to extract statistics on sample-patient matches (Javed et al., 2020). For virtually all high-purity samples without extensive loss of heterozygosity, we were able to confirm that the samples were obtained from the expected patient; for the latter group we inspected copy-number profiles (see below) to confirm that these matched the remaining samples.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…For all samples we excluded the possibility of sample mismatch by comparing germline variants identified in normal tissue to neoplasia samples of a given patient. The reads of each read-group were extracted with samtools view using options '-bh {input_bam} -r {read_group_id}' and GATK's CheckFingerprint tool was applied to extract statistics on sample-patient matches [72]. For virtually all high-purity samples without extensive loss of heterozygosity, we were able to confirm that the samples were obtained from the expected patient; for the latter group we inspected copy-number profiles (see below) to confirm that these matched the remaining samples.…”
Section: Verification Of Sample-patient Matches
Citation type: mentioning
confidence: 99%
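
The read-group extraction step quoted in the citation statements above can be sketched in Python as follows. This is a minimal illustration, not the citing authors' pipeline: the function name `samtools_read_group_cmd`, the file names, and the read-group ID are hypothetical, and the `-o` output option is an addition not present in the quoted command. The sketch only builds the argument list; it places the options before the input file, which is a reordering of the quoted '-bh {input_bam} -r {read_group_id}'.

```python
import shlex
from typing import List


def samtools_read_group_cmd(input_bam: str, read_group_id: str, output_bam: str) -> List[str]:
    """Build the argv for extracting a single read group with `samtools view`.

    Hypothetical helper for illustration; it does not run samtools itself.
    """
    return [
        "samtools", "view",
        "-b",                 # write BAM output
        "-h",                 # keep the header in the output
        "-r", read_group_id,  # restrict output to this read group
        "-o", output_bam,     # assumption: write to a file instead of stdout
        input_bam,
    ]


if __name__ == "__main__":
    # Placeholder file names and read-group ID, chosen only for the example.
    cmd = samtools_read_group_cmd("sample.bam", "RG1", "sample.RG1.bam")
    print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute the extraction, assuming samtools is installed and on the PATH. The resulting per-read-group BAM would then be the input to the fingerprint check described in the statements.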