2015
DOI: 10.1186/s12859-015-0801-z

An investigation of causes of false positive single nucleotide polymorphisms using simulated reads from a small eukaryote genome

Abstract: Background: Single Nucleotide Polymorphisms (SNPs) are widely used molecular markers, and their use has increased massively since the inception of Next Generation Sequencing (NGS) technologies, which allow detection of large numbers of SNPs at low cost. However, both NGS data and their analysis are error-prone, which can lead to the generation of false positive (FP) SNPs. We explored the relationship between FP SNPs and seven factors involved in mapping-based variant calling: quality of the reference sequence, …

Cited by 36 publications (35 citation statements)
References 35 publications
“… Ribeiro et al. (2015) explored the relationship between the choice of tools and parameters, and their impact on false positive variants.…”
Section: Discussion (mentioning)
confidence: 99%
“…The consistent genetic structure (F_IS and F_ST) inferred from both SNP sets appears remarkable provided that they differ by 33% of total SNPs comprised in TE sequences. The present data set suggests that current mapping software coupled with high-quality genome references may cope with a possible bias in the mapping in repetitive sequences (Ribeiro et al., 2015), even though marginal differences between the outcomes using all versus only non-TE SNPs remained in our study.…”
Section: Distribution Of Snps and Polymorphic Tes Within/among Popu (mentioning)
confidence: 67%
“…The false positives in variant calling from the de novo assembly are largely due to mapping errors resulting from aligning to the consensus sequence rather than a complete genome. The reference quality is known to be critical along with stringent mapping, particularly with poorer, fragmented references [45]. While this paper focuses on high throughput short read sequences, there is new long read technology and observed improvements in reference quality due to combined assemblies from long and short reads [46].…”
Section: Discussion (mentioning)
confidence: 99%