2019
DOI: 10.1534/g3.119.400165

Cleaning Genotype Data from Diversity Outbred Mice

Abstract: Data cleaning is an important first step in most statistical analyses, including efforts to map the genetic loci that contribute to variation in quantitative traits. Here we illustrate approaches to quality control and cleaning of array-based genotyping data for multiparent populations (experimental crosses derived from more than two founder strains), using MegaMUGA array data from a set of 291 Diversity Outbred (DO) mice. Our approach employs data visualizations that can reveal problems at the level of indivi…

Citations: cited by 27 publications (24 citation statements)
References: 22 publications
“…Further quality control was then performed 83, which led to the removal of several hundred more markers that had greater than 5% genotyping errors, after which genotype and allele probabilities and kinship matrices were recalculated. After the aforementioned successive marker filtration, 109,427 markers remained, out of 143,259 initial genotyping markers.…”
Section: Methods
Citation type: mentioning
Confidence: 99%
“…Genotyping by RNA-Seq software (https://gbrs.readthedocs.io/en/latest/) was used to align the RNA-Seq reads and reconstruct the individual haplotypes of DO mice. GBRS-constructed haplotypes were cross-compared against MegaMUGA-constructed diplotypes as a confirmation step to identify and correct sample mix-ups (Broman et al, 2019). We applied Expectation-Maximization algorithm for Allele Specific Expression (EMASE) (Raghupathy et al, 2018) to quantify gene expression from the individual aligned RNA-seq data.…”
Section: Methods
Citation type: mentioning
Confidence: 99%
“…Further quality control was then performed 82, which led to the removal of several hundred more markers that had greater than 5% genotyping errors, after which genotype and allele probabilities and kinship matrices were recalculated. After the aforementioned successive marker filtration, 109,427 markers remained, out of 143,259 initial genotyping markers.…”
Section: Methods
Citation type: mentioning
Confidence: 99%
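
The marker filter mentioned in the citation statements above (dropping markers with greater than 5% genotyping errors) can be sketched as follows. This is only an illustration under stated assumptions: the function name `filter_markers`, the marker names, and the per-marker error rates are hypothetical and do not come from the cited papers, and the sketch does not show how the error rates themselves are estimated. The two marker totals are the ones quoted in the statements and cover all successive filtering steps, not this 5% step alone.

```python
from typing import Dict, List


def filter_markers(error_rates: Dict[str, float],
                   max_error_rate: float = 0.05) -> List[str]:
    """Keep markers whose estimated genotyping error rate does not exceed
    the threshold (markers with *greater than* 5% errors are dropped)."""
    return [name for name, rate in error_rates.items() if rate <= max_error_rate]


# Hypothetical per-marker error rates, for illustration only.
error_rates = {
    "marker_a": 0.001,
    "marker_b": 0.120,
    "marker_c": 0.050,
    "marker_d": 0.051,
}

kept = filter_markers(error_rates)
removed = sorted(set(error_rates) - set(kept))

# Totals quoted in the citation statements (all filtering steps combined).
initial_markers = 143_259
remaining_markers = 109_427
total_removed = initial_markers - remaining_markers
fraction_kept = remaining_markers / initial_markers

print("kept:", kept)
print("removed:", removed)
print("markers removed across all steps:", total_removed)
print("fraction of markers retained:", round(fraction_kept, 2))
```

As the quoted statements note, genotype and allele probabilities and kinship matrices would need to be recalculated after a filter like this, since they depend on the retained marker set.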