2022
DOI: 10.1186/s13015-022-00221-z

Binning long reads in metagenomics datasets using composition and coverage information

Abstract: Background Advancements in metagenomics sequencing allow the study of microbial communities directly from their environments. Metagenomics binning is a key step in the species characterisation of microbial communities. Next-generation sequencing reads are usually assembled into contigs for metagenomics binning mainly due to the limited information within short reads. Third-generation sequencing provides much longer reads that have lengths similar to the contigs assembled from short reads. Howev…

Citations: cited by 20 publications (15 citation statements)
References: 36 publications
“…The VAE was successfully applied to contigs and long-reads binning 25,26 and showed better binning performance than classical dimensional reduction algorithms such as principal component analysis (PCA; Supplementary Figure 3 ). For clustering linked-reads in the latent space of VAE, the classical k-means was not optimized to process the highly imbalanced metagenomic data due to its instability to choose proper initial centroids.…”
Section: Discussion (mentioning)
confidence: 99%
“…The resulting bins were refined using MetaWRAP v1.3 bin_refinement module 77 and refined bins were assessed for contamination and completion with CheckM v1.2.0 78 . In the second approach, the binning program LRBinner v.2.1 79 , which is specialized in long reads, was utilized to bin metagenomic contigs. The third approach applied the long-read binning pipeline Nano-Phase v.0.2, which utilizes MetaBAT2 and MaxBin2, and has been validated on the ZymoBIOMICS gut microbiome standard 80 .…”
Section: Methods (mentioning)
confidence: 99%
“…Binning of long-reads presents a set of challenges such as a lack of coverage information, which is the information of an average number of reads that is mapped to a position in a reference genome, relatively high error rates, and varying degree of species coverage [ 53 ]. When compared to contigs from short-reads, the read length is significantly longer, which requires a unique binning algorithm.…”
Section: Processing (mentioning)
confidence: 99%
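
The citation statement above defines coverage as the average number of reads mapped to a position in a reference genome. A minimal sketch of that definition follows. The function name `mean_coverage` and the `(start, end)` interval representation are assumptions made for illustration. This is not how LRBinner or any cited tool computes coverage, and it presupposes the reference-based alignments that long-read binning typically lacks.

```python
from typing import List, Tuple


def mean_coverage(alignments: List[Tuple[int, int]], genome_length: int) -> float:
    """Average number of reads covering each position of a reference.

    `alignments` holds hypothetical (start, end) intervals, 0-based and
    end-exclusive, one per mapped read.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    # Difference array: +1 where a read starts, -1 just past where it ends.
    delta = [0] * (genome_length + 1)
    for start, end in alignments:
        start = max(start, 0)
        end = min(end, genome_length)
        if start < end:
            delta[start] += 1
            delta[end] -= 1

    # A running sum of the difference array gives per-position depth.
    total_depth = 0
    depth = 0
    for pos in range(genome_length):
        depth += delta[pos]
        total_depth += depth

    return total_depth / genome_length
```

For example, two reads covering positions 0-9 and 5-9 of a 10-base reference contribute 15 covered bases in total, so the mean coverage is 1.5.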
“…Then, samples are taken from latent space for the decoder to produce an output that is ideally identical to the original input. When compared to MetaBBC-LR, LRBinner is capable of producing bins with better completeness, lower contamination, better estimation of the number of bins, and overall higher precisions [ 53 ] (Fig. 4 C).…”
Section: Processingmentioning
confidence: 99%
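
The statement above describes sampling from a latent space before decoding. The sketch below shows the generic sampling step used in variational autoencoders (the reparameterisation z = mu + sigma * eps), assuming a diagonal Gaussian latent distribution. The name `sample_latent` and its parameters are hypothetical, and the code is not taken from LRBinner or any cited tool.

```python
import math
import random
from typing import List, Sequence


def sample_latent(mu: Sequence[float], log_var: Sequence[float],
                  rng: random.Random) -> List[float]:
    """Draw one latent vector z = mu + sigma * eps with eps ~ N(0, 1).

    `mu` and `log_var` stand in for the encoder outputs (assumed here);
    sigma is recovered as exp(0.5 * log_var).
    """
    if len(mu) != len(log_var):
        raise ValueError("mu and log_var must have the same length")
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


# Usage: sample a 3-dimensional latent vector with a fixed seed.
z = sample_latent([0.0, 1.0, -1.0], [0.0, -2.0, -4.0], random.Random(0))
```

A decoder would then map `z` back to the input space, and training would push that reconstruction towards the original input.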