2017
DOI: 10.3897/mycokeys.26.14591

Read quality-based trimming of the distal ends of public fungal DNA sequences is nowhere near satisfactory

Abstract: DNA sequences are increasingly used for taxonomic and functional assessment of environmental communities. In mycology, the nuclear ribosomal internal transcribed spacer (ITS) region is the most commonly chosen marker for such pursuits. Molecular identification is associated with many challenges, one of which is low read quality of the reference sequences used for inference of taxonomic and functional properties of the newly sequenced community (or single taxon). This study investigates whether public fungal IT…

Cited by 10 publications (10 citation statements)
References 34 publications
“…BLAST identity considers individual indels as mismatches and hence results in lower similarity values than the other two approaches for a given sequence pair. It is also more sensitive to homopolymer-based sequencing errors in the query reads and affected by improper trimming of low-quality terminal portions of reference sequences (Nilsson et al 2017). As a result, sequences retrieved as best hits in BLAST searches are not necessarily most closely related (e.g.…”
Section: Pairwise Similarity Assessments: Limitations and Solutions
Citation type: mentioning
confidence: 99%
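The effect described in the statement above can be illustrated with a minimal sketch. It assumes two sequences that are already aligned, with `-` marking a gap. The quoted text does not name the "other two approaches", so the gap-excluding measure below is only a hypothetical comparison point, and both function names are illustrative rather than part of BLAST or any cited tool.

```python
def identity_gaps_as_mismatches(a: str, b: str) -> float:
    """Percent identity over all alignment columns.

    Every column that contains a gap counts as a mismatch,
    which mirrors the behaviour described in the statement above.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal in length, and non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def identity_ignoring_gaps(a: str, b: str) -> float:
    """Percent identity over gap-free columns only (hypothetical comparison measure)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned and equal in length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# A single homopolymer-length difference (one T missing in the second sequence).
query = "ACGTTTTACG"
reference = "ACGTT-TACG"

print(identity_gaps_as_mismatches(query, reference))  # 90.0
print(identity_ignoring_gaps(query, reference))       # 100.0
```

Under these assumptions, a single homopolymer indel lowers the gap-penalising identity from 100% to 90% on a 10-column alignment, while the gap-excluding measure is unaffected.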
“…A major concern is the reliability of the DNA sequence data (Bridge et al 2003, Nilsson et al 2006). PCR or cloning errors (including the introduction of chimeras), DNA degradation, and post-processing of chromatograms, have been shown to be a source of sequence variation in at least some groups (Haas et al 2011, Sandoval-Sierra et al 2014, Hughes et al 2015, Strid et al 2015, Aas et al 2017, Nilsson et al 2017, Thielecke et al 2017, Bieker & Martin 2018). Such DNA sequences are not real and cannot be checked or corrected without access to a physical specimen or, as a minimum, access to the raw sequence reads (Tripp & Lendemer 2014).…”
Section: Reliability and Extent Of Data
Citation type: mentioning
confidence: 99%
“…The generated data should be deposited to sequence repositories just as carefully and as richly annotated as in the case of Sanger sequences to avoid errors derived from the experimental procedure (cf. Nilsson et al, ). The sequencing error rate, determined through the use of known high‐quality reference data, should always be included in the submission, included either in the sequence header or as additional data.…”
Section: Discussion
Citation type: mentioning
confidence: 99%