2021
DOI: 10.1101/2021.05.01.442223
Preprint

Linked machine learning classifiers improve species classification of fungi when using error-prone long-reads on extended metabarcodes

Abstract: The increased usage of long-read sequencing for metabarcoding has not been matched with public databases suited for error-prone long-reads. We address this gap and present a proof-of-concept study for classifying fungal species using linked machine learning classifiers. We demonstrate its capability for accurate classification using labelled and unlabelled fungal sequencing datasets. We show the advantage of our approach for closely related species over current alignment and k-mer methods and suggest a confide…

citations
Cited by 3 publications
(1 citation statement)
references
References 69 publications
“…Another pitfall in the use of NGS technology for routine fungal pathogen identification remains the need of improvement in processing the huge amount of NGS data both at infrastructure level (server and memory power) and bioinformatic skills availability (algorithms and expert technicians). To this aim, an ad hoc pipeline using machine learning (ML) classifiers has been developed as an alternative method for assigning individual error-prone sequence-long reads to taxa [28, 179, 180]. In metabarcoding studies, including those concerning human diseases, ML modeling will help in prediction of disease outputs and in deciphering environmental factors shaping the microbial composition also in agriculture and in natural ecosystems [29, 181, 182, 183, 184].…”
Section: Discussion (mentioning)
confidence: 99%