2020
DOI: 10.1101/2020.09.23.310110
Preprint

Samplot: A Platform for Structural Variant Visual Validation and Automated Filtering

Abstract: Visual validation is an essential step in structural variant (SV) detection to eliminate false positives. We present Samplot, a tool for quickly creating images that display the read depth and sequence alignments necessary to adjudicate purported SVs across multiple samples and sequencing technologies, including short, long, and phased reads. These simple images can be rapidly reviewed to curate large SV call sets. Samplot is easily applicable to many biological problems such as prioritization of potentially c…

Citations: cited by 10 publications (13 citation statements)
References: 29 publications
“…We then visually scrutinized the evidence in the parents and grandparents of each third-generation CEPH sample with a dnSV call. We examined IGV and Samplot (Belyeu et al 2020) images that included the offspring sample, both parents, and both sets of grandparents and carefully examined each for any missed split read, discordant pair, or coverage depth signals indicating that the putative dnSV was actually a missed transmission event. This provided an extra opportunity to detect elusive inherited variants.…”
Section: Results (mentioning)
confidence: 99%
“…Given the lack of validated training set for CEPH and the much smaller sample size, only a simple filter of less than 30 for depth-based GQ and PE/SR GQ in parents was applied. All variants that passed these filters were manually reviewed using duphold (Pedersen and Quinlan 2019), IGV (Robinson et al 2011), SV-Plaudit (Belyeu et al 2018) with Samplot (Belyeu et al 2020), and an internal R based visualization script found in GATK-SV. In order to reduce the chance of missing a variant of interest, all private variants with a passing parental GQ are included in the manual investigation.…”
Section: Methods (mentioning)
confidence: 99%
“…Next, we exclude CNVs in telomeric regions, CNVs found in only one timepoint, and CNVs that are less than four windows (2 kb) long. Finally, we manually inspect our structural variant and CNV calls together using a modified version of Samplot (Belyeu et al, 2020) to create a list of confirmed copy number variants in our populations (Supplementary file 3). During this analysis, we noticed two regions with high copy-number that experienced copy-number changes in many populations: one associated with the CUP1 tandem array and one associated with the ribosomal DNA tandem array.…”
Section: Methods (mentioning)
confidence: 99%
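
The pre-filter described in the statement above (drop telomeric CNVs, CNVs seen at only one timepoint, and CNVs shorter than four windows) could be sketched roughly as follows. This is an illustrative sketch, not the citing authors' code: the `Cnv` record, the `keep_cnv` helper, and the 10 kb telomere margin are hypothetical, and the 500 bp window size is only inferred from "four windows (2 kb)".

```python
from dataclasses import dataclass

WINDOW_BP = 500        # inferred from the quote: four windows = 2 kb
MIN_WINDOWS = 4        # CNVs shorter than this many windows are excluded
TELOMERE_BP = 10_000   # hypothetical telomere margin; not stated in the quote


@dataclass(frozen=True)
class Cnv:
    """Hypothetical CNV record with half-open coordinates [start, end)."""
    chrom: str
    start: int
    end: int
    timepoints: frozenset  # timepoints at which the CNV was called


def keep_cnv(cnv, chrom_lengths, telomere_bp=TELOMERE_BP):
    """Return True if the CNV passes all three exclusion rules."""
    chrom_len = chrom_lengths[cnv.chrom]
    # Rule 1: exclude CNVs that reach into either chromosome end.
    in_telomere = cnv.start < telomere_bp or cnv.end > chrom_len - telomere_bp
    # Rule 2: exclude CNVs found at only one timepoint.
    seen_repeatedly = len(cnv.timepoints) >= 2
    # Rule 3: exclude CNVs shorter than four windows (2 kb).
    long_enough = (cnv.end - cnv.start) >= MIN_WINDOWS * WINDOW_BP
    return not in_telomere and seen_repeatedly and long_enough


def filter_cnvs(cnvs, chrom_lengths):
    """Keep only the CNVs that would go on to manual review."""
    return [c for c in cnvs if keep_cnv(c, chrom_lengths)]
```

Calls that survive a filter like this would then be inspected by eye, which is the step where the statement says a modified Samplot was used.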
“…We filtered calls that overlapped the centromere, assembly gaps, or sequence coverage (PacBio CLR lra alignments) greater than twice the average coverage. Each resulting callset was manually curated using dot-plots of the genome assemblies or HiFi reads against the reference, and using samplot (Belyeu et al, 2020). Calls were classified as true positives if at least one haplotype or an exemplary HiFi read clearly showed the inversion or it was indicated from the samplot alignments, a duplication if the dot-plot signature was a fixed inverted duplication or inverted transposition, false-positive if both haplotypes did not show variation or an inverted duplication structure, and NA if it was not clear, commonly in pericentric regions.…”
Section: Results (mentioning)
confidence: 99%
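
The region and coverage filter in the statement above amounts to an interval-overlap test plus a depth threshold. The sketch below is one way to express that, under stated assumptions: the tuple layouts, the `overlaps` and `keep_call` helpers, and the use of a single per-call coverage value are hypothetical, and the citing paper may have computed coverage differently.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def keep_call(call, excluded_regions, mean_coverage):
    """Return True if a call avoids excluded regions and high-coverage sequence.

    call: (chrom, start, end, coverage), a hypothetical layout.
    excluded_regions: iterable of (chrom, start, end), e.g. centromeres and gaps.
    """
    chrom, start, end, coverage = call
    # Drop calls in sequence covered at more than twice the average depth.
    if coverage > 2 * mean_coverage:
        return False
    # Drop calls that touch any centromere or assembly-gap interval.
    return not any(
        chrom == ex_chrom and overlaps(start, end, ex_start, ex_end)
        for ex_chrom, ex_start, ex_end in excluded_regions
    )


def filter_calls(calls, excluded_regions, mean_coverage):
    """Keep the calls that would proceed to manual curation."""
    return [c for c in calls if keep_call(c, excluded_regions, mean_coverage)]
```

A linear scan over the excluded regions is enough for a sketch; a call set of realistic size would more likely use a sorted index or an interval tree.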