2021
DOI: 10.1111/1755-0998.13384

Debar: A sequence‐by‐sequence denoiser for COI‐5P DNA barcode data

Abstract: DNA barcoding and metabarcoding are now widely used to advance species discovery and biodiversity assessments. High-throughput sequencing (HTS) has expanded the volume and scope of these analyses, but elevated error rates introduce noise into sequence records that can inflate estimates of biodiversity. Denoising, the separation of biological signal from instrument (technical) noise, of barcode and metabarcode data currently employs abundance-based methods which do not capitalize on the highly conserved structur…

Citations: cited by 2 publications (3 citation statements)
References: 45 publications (87 reference statements)

“…Similar computational and statistical tools, in the form of MATLAB packages, R packages, Python packages and methodological pipelines, used to assess anomalies in DNA (meta)barcodes, have been released. Examples include divisive hierarchical clustering: DADA (Rosen et al 2012) and DADA2 (Callahan et al 2016); artificial neural networks: (Ma et al 2018); Profile Hidden Markov Models: coil (Nugent et al 2020), debar (Nugent et al 2021 and Porter and Hajibabaei 2021); distribution sample quantiles: MACER (Young et al 2021); and Shannon entropy: SequenceBouncer (Dunn 2021), A2G2 (Hleap et al 2020), DnoisE (Antich et al 2022 and Turon et al 2020). These methods and programmes are beginning to see widespread use within the biodiversity and regulatory science communities.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Machine learning approaches also allow prediction of patterns of biodiversity at large geographical scales by facilitating the combination of genomic, ecological and geographical data in novel ways (Barrow et al, 2020). Finally, machine learning can improve our estimates of biodiversity by allowing efficient error correction of barcoding data sets (Nugent et al, 2021). Again, the diversity of ap-…”
Section: Biodiversity and Species Limits
Citation type: mentioning
confidence: 99%
“…However, errors can result in inflated estimates of diversity when not corrected. To address this issue, Nugent et al (2021) introduce debar, an approach for denoising COI-5P DNA barcode data using machine learning. Debar uses a Profile Hidden Markov model (PHMM) to detect indel errors in COI barcoding data.…”
Citation type: mentioning
confidence: 99%