2020
DOI: 10.1038/s41598-020-77218-4

Accuracy and efficiency of germline variant calling pipelines for human genome data

Abstract: Advances in next-generation sequencing technology have enabled whole genome sequencing (WGS) to be widely used for identification of causal variants in a spectrum of genetic-related disorders, and provided new insight into how genetic polymorphisms affect disease phenotypes. The development of different bioinformatics pipelines has continuously improved the variant analysis of WGS data. However, there is a necessity for a systematic performance comparison of these pipelines to provide guidance on the applicati…

Cited by 75 publications
(66 citation statements)
References 41 publications
“…As a demonstration we chose Agilent hybrid capture panel + Illumina DRAGEN somatic small variant caller. For our ultra-large dataset (> 1000 × WES) without matched normal samples, Illumina DRAGEN bioinformatics pipeline was chosen based on high accuracy, high efficiency, and for not needing matched normal samples 29 , 30 . A recent benchmark study showed that across 5 datasets, DRAGEN produces 14–67% and 22–91% fewer false SNV calls, 35–86% and 48–89% fewer false indel calls, at 75% and 830% faster speed than Strelka2 and Mutect2 respectively 24 .…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, sequencing pipelines, typically using algorithms running on high-end computer clusters on general CPUs, are being integrated in processors themselves (i.e., ‘hard-coded’). The reconfigurable DRAGEN Bio-IT Processor, produced by Illumina [ 56 ], has hard-coded highly optimized algorithms for the full next generation sequencing (NGS) secondary analysis pipeline, which set world speed records for genomic data analysis [ 57 ].…”
Section: Progress Of Genomic Technologies Enabling Personalized Medicine (mentioning)
confidence: 99%
“…This file contains biological sequences aligned to a reference sequence in a machine readable format. Methods like Samtools [3] and The Genome Analysis Toolkit (GATK) are recognized as being the most accurate for more advanced Bioinformatics analysis [4][5] such as variant calling (BAM to VCF), to look for a single-nucleotide polymorphism (SNP) "genetic mutations". The FreeBayes variant detector gives by far the best compromise in speed and sensitivity [6][7].…”
Section: Introduction (mentioning)
confidence: 99%