2013
DOI: 10.1002/0471250953.bi1110s43

From FastQ Data to High‐Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline

Abstract: This unit describes how to use BWA and the Genome Analysis Toolkit (GATK) to map genome sequencing data to a reference and produce high‐quality variant calls that can be used in downstream analyses. The complete workflow includes the core NGS data‐processing steps that are necessary to make the raw data suitable for analysis by the GATK, as well as the key methods involved in variant discovery using the GATK. Curr. Protoc. Bioinform. 43:11.10.1‐11.10.33. © 2013 by John Wiley & Sons, Inc.

Citations: cited by 5,119 publications (2,973 citation statements)
References: 13 publications
“…(2017), single nucleotide polymorphisms (SNPs) were identified in the nuclear genome using three different variant callers: SAMTOOLS 1.1 (Li, 2011), FREEBAYES 0.9.15 (Garrison & Marth, 2012), and GATK 3.1 (DePristo et al., 2011; McKenna et al., 2010; Van der Auwera et al., 2013). A SNP set referred to as “consensus SNPs” hereafter was obtained from the overlap between callers, removing SNPs with a quality score below 50 and SNPs with mapping quality below 30.…”
Section: Methods (mentioning)
confidence: 99%
“…Due to the long reads, high error rates and continuously evolving error profile of the MinION basecalls at this early stage of technology roll out, variant callers such as the Genome Analysis Toolkit’s UnifiedGenotyper or HaplotypeCaller [13] were unable to identify variants or haplotypes in the MinION sequence data during our trials. Variant and haplotype level information, however, was readily accessible based on coverage of aligned reads, which we extracted using SAMTools via the Pysam wrapper (http://github.com/pysam-developers/pysam) [14].…”
Section: Methods (mentioning)
confidence: 99%
“…This choice will determine the appropriate type of assembly program to use (e.g., GATK: McKenna et al., 2010; dePristo et al., 2011; Van der Auwera et al., 2013 with a reference genome; Stacks: Catchen, Amores, Hohenlohe, Cresko, & Postlethwait, 2011; Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013; Paris, Stevens, & Catchen, 2017; or dDocent: Puritz, Hollenbeck, & Gold, 2014 for a de novo assembly). Using a high‐quality and well‐annotated reference genome facilitates the identification of candidate genes and gene regions and allows for a truly genomic approach (e.g., considering physical linkage between regions with adaptive variation; Manel et al., 2016).…”
Section: Design and Implement: Assessment (mentioning)
confidence: 99%
“…A major decision that will determine which loci are included in the dataset is choosing the parameters determining how closely the sequences must match (either match the reference sequence or match other sequences in de novo approaches; Catchen et al., 2011; McKenna et al., 2010; dePristo et al., 2011; Van der Auwera et al., 2013) and how often the sequences occur in individuals (i.e., coverage). If the sensitivity of these parameters is too low, sequences will be combined that are not from the same genomic region (i.e., paralogs; McKinney, Waples, Seeb, & Seeb, 2017).…”
Section: Design and Implement: Assessment (mentioning)
confidence: 99%
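
One of the citation statements above describes a "consensus SNPs" set: the overlap between three variant callers, with SNPs of quality score below 50 or mapping quality below 30 removed. The following is a minimal sketch of that idea, not the cited authors' code. The `SnpCall` record, the `consensus_snps` function, and the choice to require each site to pass both thresholds in every caller are assumptions made for illustration; the quoted text does not say which caller's scores were used or in what order the steps were applied.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class SnpCall(NamedTuple):
    """Hypothetical minimal record for one SNP call (not a real VCF parser)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float  # variant quality score, e.g. the VCF QUAL column
    mapq: float  # mapping quality at the site, e.g. the VCF INFO/MQ field


SiteKey = Tuple[str, int, str, str]


def consensus_snps(
    calls_by_caller: Dict[str, Iterable[SnpCall]],
    min_qual: float = 50.0,
    min_mapq: float = 30.0,
) -> Set[SiteKey]:
    """Return sites reported by every caller that pass both thresholds.

    Assumption: a site must pass the thresholds in each caller's output.
    """
    passing_sets = []
    for calls in calls_by_caller.values():
        passing = {
            (c.chrom, c.pos, c.ref, c.alt)
            for c in calls
            if c.qual >= min_qual and c.mapq >= min_mapq
        }
        passing_sets.append(passing)
    if not passing_sets:
        return set()
    # Overlap between callers: keep only sites present in all passing sets.
    return set.intersection(*passing_sets)


if __name__ == "__main__":
    # Toy data; caller names are labels only, no tools are invoked.
    example = {
        "samtools": [SnpCall("chr1", 100, "A", "G", 80.0, 60.0),
                     SnpCall("chr1", 200, "C", "T", 40.0, 60.0)],
        "freebayes": [SnpCall("chr1", 100, "A", "G", 75.0, 55.0),
                      SnpCall("chr1", 200, "C", "T", 90.0, 60.0)],
        "gatk": [SnpCall("chr1", 100, "A", "G", 99.0, 58.0),
                 SnpCall("chr1", 300, "G", "A", 99.0, 60.0)],
    }
    print(sorted(consensus_snps(example)))
```

In practice the calls would be read from each caller's VCF output; set intersection on `(chrom, pos, ref, alt)` keys is used here only because it keeps the overlap step explicit.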