2021
DOI: 10.1186/s13040-021-00259-6

Estimating sequencing error rates using families

Abstract: Background: As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. Family data provide a unique opportunity for estimating sequencing error rates since it allows u…

Cited by 10 publications
(12 citation statements)
References 38 publications
“…In order to better understand patterns of contamination in human whole genome sequencing, we analyzed sequences from the iHART dataset [36]. Originally curated to study genetic determinants of autism, the iHART dataset contains whole genome sequences from blood samples from children with autism, their siblings, and their parents, but also stands as an invaluable genomics resource due to its unique family structure [37, 38]. iHART was sequenced at the New York Genome Sequencing Center, a common site for large sequencing studies, using commonly followed storage, prep, and sequencing protocols [36], making it a good model dataset to understand common sequencing issues.…”
Section: Introduction (mentioning)
confidence: 99%
“…In order to better understand patterns of contamination in human whole genome sequencing, we analyzed sequences from the iHART dataset [37]. Originally curated to study genetic determinants of autism, the iHART dataset contains whole genome sequences from blood samples from children with autism, their siblings, and their parents, but also stands as an invaluable genomics resource due to its unique family structure [38, 39, 40]. iHART was sequenced at the New York Genome Sequencing Center, a common site for large sequencing studies, using commonly followed storage, prep, and sequencing protocols [37], making it a good model dataset to understand common sequencing issues.…”
Section: Introduction (mentioning)
confidence: 99%
“…We use ASLAN with the iHART dataset [26], a large WGS dataset from families with autistic children (4,501 individuals from 1,010 families), that our group curated originally to study the genetic components of autism. To our knowledge, this is one of the largest familial WGS datasets in the world and offers a unique opportunity for family-based analysis such as ASLAN and others [25, 4, 6]. The iHART collection includes the raw WGS from the individuals, as well as the aligned and variant-called data (VCF format) in reference to GRCh38.…”
Section: Results (mentioning)
confidence: 99%