2020
DOI: 10.7717/peerj.8966

Deconvolute individual genomes from metagenome sequences through short read clustering

Abstract: Metagenome assembly from short next-generation sequencing data is a challenging process due to its large scale and computational complexity. Clustering short reads by species before assembly offers a unique opportunity for parallel downstream assembly of genomes with individualized optimization. However, current read clustering methods suffer from either false negative (under-clustering) or false positive (over-clustering) problems. Here we extended our previous read clustering software, SpaRC, by exploiting statis…

Cited by 7 publications
(2 citation statements)
References 32 publications
“…LPA is capable of resolving genomes with shared reads and has near linear computational performance. SpaRC can be run at two different modes: "local mode" only cluster reads based on their overlap, while "global mode" further clusters the results from local mode based on multiple sample statistics (Li et al, 2020).…”
Section: The Hybrid-lpa Algorithm (mentioning)
confidence: 99%
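
To illustrate the overlap-based "local mode" clustering that the statement above describes, here is a minimal sketch of generic label propagation (LPA) over a read-overlap graph. It is not SpaRC's implementation, which runs on Apache Spark. The function names (`kmers`, `overlap_graph`, `label_propagation`), the `min_shared` parameter, the tie-breaking rule, and the toy reads are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def kmers(seq, k):
    """Return the set of k-mers in a read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def overlap_graph(reads, k, min_shared=1):
    """Connect two reads if they share at least `min_shared` k-mers (hypothetical rule)."""
    index = defaultdict(set)
    for rid, seq in enumerate(reads):
        for km in kmers(seq, k):
            index[km].add(rid)
    shared = Counter()
    for rids in index.values():
        for a, b in combinations(sorted(rids), 2):
            shared[(a, b)] += 1
    graph = {rid: set() for rid in range(len(reads))}
    for (a, b), n in shared.items():
        if n >= min_shared:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def label_propagation(graph, max_iter=20):
    """Asynchronous LPA: each node adopts the most common label among its neighbors.

    Ties are broken by taking the smallest label so the result is deterministic.
    """
    labels = {node: node for node in graph}
    for _ in range(max_iter):
        changed = False
        for node in sorted(graph):
            if not graph[node]:
                continue  # isolated read keeps its own label
            counts = Counter(labels[n] for n in graph[node])
            best = max(counts.values())
            new = min(lab for lab, c in counts.items() if c == best)
            if new != labels[node]:
                labels[node] = new
                changed = True
        if not changed:
            break
    return labels


# Toy reads: the first three overlap each other, the last three form a second group.
reads = [
    "ACGTACGGA", "GTACGGATT", "CGGATTCCA",
    "TTTTGGGGC", "TGGGGCCCA", "GGCCCAAAT",
]
graph = overlap_graph(reads, k=4)
labels = label_propagation(graph)
print(labels)
```

In this sketch every read ends up with exactly one label, so it does not show how reads shared between genomes are handled; it only shows the basic propagation step on an overlap graph.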
“…We previously developed a scalable metagenome clustering tool called SpaRC (Shi et al, 2018; Li et al, 2020) based on Apache Spark. SpaRC can form pure and complete clusters with long-read sequencing technologies.…”
Section: Introduction (mentioning)
confidence: 99%