2019
DOI: 10.1016/j.ygeno.2018.05.004

Quality control and integration of genotypes from two calling pipelines for whole genome sequence data in the Alzheimer's disease sequencing project

Abstract: The Alzheimer's Disease Sequencing Project (ADSP) performed whole genome sequencing (WGS) of 584 subjects from 111 multiplex families at three sequencing centers. Genotype calling of single nucleotide variants (SNVs) and insertion-deletion variants (indels) was performed centrally using GATK-HaplotypeCaller and Atlas V2. The ADSP Quality Control (QC) Working Group applied QC protocols to project-level variant call format files (VCFs) from each pipeline, and developed and implemented a novel protocol, termed "c…

Cited by 36 publications (53 citation statements)
References 29 publications
“…Several WES QC pipelines have been described 9,10, which use the Genome Analysis Tool Kit (GATK) Variant Quality Score Recalibration (VQSR) approach as their backbones while enhancing GATK’s output by utilizing various hard filters to further screen data based on specific QC metrics. However, no objectively evaluated WGS QC pipeline had been developed until very recently 11, and this pipeline did not utilize duplicate samples in determining QC filter thresholds or to prioritize filters based on efficacy, and it only considered biallelic variants. WGS studies typically use at least one hard filter based on output parameters from variant calling, but the exact filters and threshold values employed are often arbitrary or not empirically determined 12–14.…”
Section: Introduction (mentioning)
confidence: 99%
“…WGS studies typically use at least one hard filter based on output parameters from variant calling, but the exact filters and threshold values employed are often arbitrary or not empirically determined 12–14. In previous studies, multiallelic (non-biallelic) variants were systematically removed in QC steps prior to downstream analysis 11,15,16, as they were broadly deemed low in quality. However, as sample sizes in sequencing studies increase 10,17,18, the prevalence of multiallelic variants rises 19.…”
Section: Introduction (mentioning)
confidence: 99%
“…As a proof of principle, we applied our method to the Alzheimer’s Disease Sequencing Project (ADSP) case-control data [49] to approximate the number of potential mutations our approach could rescue. The ADSP is a large sequencing project organized, in part, to identify functional mutations that influence Alzheimer’s disease development.…”
Section: Results (mentioning)
confidence: 99%
“…Median genome-wide read depths ranged from 35.4x to 42.9x, with a median of 39.4x. Samples were prepared and sequenced as part of the ADSP [49]. These samples were aligned using BWA (v0.5.9).…”
Section: Methods (mentioning)
confidence: 99%
“…The majority of rare variant detection studies utilise, at least in part, WES. Assembly of WES data within an experiment is now relatively standardised, with specifically designed software, quality control and analysis pipelines [74]. However, as evidenced from meta-analysis of GWAS [7,75], much of the power of genetic analysis of complex traits is gained through combination of data from multiple independent experiments.…”
Section: Challenges Of Rare Variant Identification (mentioning)
confidence: 99%