2021
DOI: 10.1101/2021.01.04.425285
Preprint

debar, a sequence-by-sequence denoiser for COI-5P DNA barcode data

Abstract: DNA barcoding and metabarcoding are now widely used to advance species discovery and biodiversity assessments. High-throughput sequencing (HTS) has expanded the volume and scope of these analyses, but elevated error rates introduce noise into sequence records that can inflate estimates of biodiversity. Denoising of barcode and metabarcode data, the separation of biological signal from instrument (technical) noise, currently employs abundance-based methods which do not capitalize on the highly conserved structur…

Cited by 2 publications
(3 citation statements)
References 36 publications
“…Our pseudogene removal approach was most effective on datasets of the full length COI barcode sequence region but is less effective for shorter sequences (~ 300 bp). Now that newer sequencing technologies such as LoopSeq, compatible with Illumina sequencing platforms but currently only available for RNA genes, or HiFi circular consensus sequencing (PacBio), it may one day be possible for COI metabarcoding to target the full length of the barcoding region to facilitate more efficient nuMT detection [39, 66–68]. It would also be helpful if DNA barcode studies reported and deposited full length verified pseudogenes into public databases when possible.…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, COI marker analysis need not be limited to operational taxonomic units (OTUs), but may also include the use of exact sequence variant (ESV) analysis for improved taxonomic resolution and permit intraspecific phylogeographic analyses [34–37]. Bioinformatic tools to remove sequence artefacts and noise specifically from COI datasets have also become available [38–40]. COI nuMTs have been discussed in the literature largely with regards to COI barcoding efforts [18,19,41] and only recently have tools appropriate for screening nuMTs from large batches of COI sequences become available [42].…”
Section: Introduction (mentioning)
confidence: 99%
“…Our pseudogenes removal approach was most effective on datasets of the full length COI barcode sequence region but is less effective for shorter sequences (∼ 300 bp). This is especially relevant now that newer sequencing technologies such as LoopSeq (compatible with Illumina sequencing platforms, but currently only available for RNA genes) or HiFi circular consensus sequencing (PacBio) could one day be used for COI metabarcoding targeting the full length of the barcoding region facilitating pseudogene detection [12, 77–79]. It would also be helpful if COI barcode studies reported and deposited full length verified pseudogenes into public databases when possible.…”
Section: Discussion (mentioning)
confidence: 99%