2021
DOI: 10.1101/2021.08.16.456517
Preprint

SemiBin: Incorporating information from reference genomes with semi-supervised deep learning leads to better metagenomic assembled genomes (MAGs)

Abstract: Metagenomic binning is the step in building metagenome-assembled genomes (MAGs) in which sequences predicted to originate from the same genome are automatically grouped together. The most widely used binning methods are reference-independent: they operate de novo and allow the recovery of genomes from previously unsampled clades. However, they do not leverage the knowledge in existing databases. Here, we propose SemiBin, an open source tool that uses neural networks to implement a semi-supervised approach, i.e. S…

Cited by 3 publications (3 citation statements)
References 57 publications (91 reference statements)
“…Since different binning tools reconstruct genomes at various levels of completeness, a bin aggregation software, i.e., DAS Tool v1.1.0 (20), was used to integrate the results of bin predictions made by CONCOCT, MetaBAT2, and MaxBin2 to optimize the selection on nonredundant, high-quality bin sets using default parameters. Recently published binning algorithms, including variational autoencoders for metagenomic binning (VAMB) (66) and SemiBin (67), were not considered in this study but are important alternatives to consider in the future to facilitate the recovery of high-quality MAGs. Bin statistics, including total size, number of contigs, N50, GC content, etc., were obtained using the anvi-summarize function of anvi’o, while estimates of quality (completeness, redundancy, strain heterogeneity, etc.)…”
Section: Methods (mentioning)
confidence: 99%
“…We run Graphbin with the output of MetaBAT as initial bins, which are required by this tool. Finally, we ran SemiBin (Pan et al., 2021), a recently proposed deep learning binner, using one of the pretrain models provided by the authors (ocean model) as well as training on our own data with the default parameters.…”
Section: Methods (mentioning)
confidence: 99%
“…Other deep learning approaches have also been recently proposed. LRBinner (Wickramarachchi and Lin, 2021) adapts VAEs to long-reads, while SemiBin (Pan et al., 2021) uses a semi-supervised siamese neural network with must-link and cannot-link constraints obtained with reference genomes.…”
Section: Introduction (mentioning)
confidence: 99%
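
Note on the last statement: the idea of training a siamese network from must-link and cannot-link pairs can be illustrated with a generic contrastive loss. The sketch below is an assumption-laden illustration, not SemiBin's actual code: the function names, the squared-distance form of the loss, and the `margin` value are all hypothetical choices made here. It only shows how a must-link pair is penalized for being far apart in embedding space and a cannot-link pair for being closer than a margin.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(emb_a, emb_b, must_link, margin=1.0):
    """Generic contrastive loss for one pair of contig embeddings.

    must_link=True  : pair should be close, so penalize squared distance.
    must_link=False : pair should be at least `margin` apart, so penalize
                      only the shortfall below the margin.
    Both the loss form and `margin` are illustrative assumptions.
    """
    d = euclidean(emb_a, emb_b)
    if must_link:
        return d ** 2
    return max(0.0, margin - d) ** 2


def mean_pair_loss(pairs, margin=1.0):
    """Average loss over (emb_a, emb_b, must_link) triples."""
    return sum(contrastive_loss(a, b, ml, margin) for a, b, ml in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical 2-D embeddings of three contigs.
    pairs = [
        ([0.0, 0.0], [0.0, 0.5], True),   # must-link, slightly apart
        ([0.0, 0.0], [3.0, 4.0], False),  # cannot-link, already far apart
    ]
    print(mean_pair_loss(pairs))
```

In a real siamese setup the same network would produce both embeddings of a pair and this loss would be minimized by gradient descent; here the embeddings are fixed numbers so that the behavior of the two constraint types can be inspected directly.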