2019
DOI: 10.1101/664623
Preprint

A robust benchmark for germline structural variant detection

Abstract: New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution, and comprehensiveness. Translating these methods to routine research and clinical practice requires robust benchmark sets. We developed the first benchmark set for identification of both false negative and false positive germline SVs, which complements recent efforts emphasizing increasingly comprehensive characterization of SVs. To create this benchmark for a broadly c…

Citations: cited by 56 publications (64 citation statements)
References: 81 publications
“…For SNVs, we benchmarked the calls from three strategies using the gold standard of NA12878 (ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878_HG001/latest/GRCh38/) and NA24385 (ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/AshkenazimTrio/HG002_NA24385_son/latest/GRCh38/). For SVs, we compared three linked-read sets (R9, R10, R11) from HG002 with the Tier 1 SV benchmark from Genome in a Bottle [36] and used VaPoR [37] to validate our SV calls based on PacBio CCS reads from NA24385 (Highly-accurate long-read sequencing improves variant detection and assembly of a human genome). We compared SNV and SV calls among the different approaches using vcfeval [38] and truvari [36], respectively.…”
Section: Data Description | mentioning
confidence: 99%
“…[10] Because of the unique value gained by optical mapping of ultra-long DNA reads, it has been used in essentially all modern reference genome assemblies (human GRCh [11, 12], mouse [13], goat [14], maize [15]) as well as benchmark structural variation papers [16-18].…”
Section: Introduction | mentioning
confidence: 99%
“…We describe a machine learning model, BioGraph QUALclassifier, that uses coverage signatures to assign quality scores to discovered alleles to increase specificity and assist prioritization of variants. We benchmark five technical replicates of an individual (HG002) against the Genome in a Bottle Tier 1 SV set [21] alongside other SV detection pipelines to illustrate relative performance. Finally, we merge discovered calls and confirm the presence of alleles with BioGraph Coverage to assess the per-replicate increase in recall over discovery.…”
Section: Introduction | mentioning
confidence: 99%