2012
DOI: 10.1186/gb-2012-13-12-r122

Ray Meta: scalable de novo metagenome assembly and profiling

Abstract: Voluminous parallel sequencing datasets, especially metagenomic experiments, require distributed computing for de novo assembly and taxonomic profiling. Ray Meta is a massively distributed metagenome assembler that is coupled with Ray Communities, which profiles microbiomes based on uniquely-colored k-mers. It can accurately assemble and profile a three billion read metagenomic experiment representing 1,000 bacterial genomes of uneven proportions in 15 hours with 1,024 processor cores, using only 1.5 GB per co…
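
The abstract's idea of profiling by uniquely-colored k-mers can be illustrated with a minimal sketch. This is not Ray Communities' implementation: the function names, the toy sequences, and the normalization by total hits are assumptions made for illustration, and reverse complements, sequencing errors, and genome length are ignored. A k-mer is "colored" by the reference genome that contains it, and only k-mers found in exactly one genome are used to attribute reads.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_unique_colors(references, k):
    """Map each k-mer to a genome name, keeping only k-mers seen in exactly one genome."""
    owners = defaultdict(set)
    for name, seq in references.items():
        for km in kmers(seq, k):
            owners[km].add(name)
    return {km: next(iter(names)) for km, names in owners.items() if len(names) == 1}


def profile(reads, unique_colors, k):
    """Count read k-mers per genome via uniquely-colored k-mers, then normalize to fractions."""
    hits = Counter()
    for read in reads:
        for km in kmers(read, k):
            if km in unique_colors:
                hits[unique_colors[km]] += 1
    total = sum(hits.values())
    return {name: count / total for name, count in hits.items()} if total else {}


# Toy data (hypothetical): two tiny "genomes" that share the k-mer ACGT.
references = {"genome_a": "ACGTACGGA", "genome_b": "TTGCACGTT"}
reads = ["ACGTAC", "CGGA", "TTGCA"]

colors = build_unique_colors(references, k=4)
abundances = profile(reads, colors, k=4)
print(abundances)
```

The shared k-mer ACGT is dropped from the color table, so it cannot be attributed to either genome; the remaining hits are split between the two references in proportion to how many uniquely-colored k-mers the reads contain.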

Cited by 541 publications (417 citation statements)
References 51 publications
“…Metagenomes from rumen, human gut, and permafrost soil sequencing could be assembled only by discarding low-abundance sequences before assembly (2,4,5). Although many metagenome-specific assemblers have been developed recently for the assembly of low-complexity communities, they cannot work with the volume of reads necessary to achieve high coverage for extremely diverse environmental metagenomes (10)(11)(12).…”
mentioning
confidence: 99%
“…The adapters were trimmed from the reads using Trimmomatic-0.30 (Bolger et al, 2014) quality checked with Sickle (Joshi and Fass, 2011), and assembled using Ray with default parameters, and 23 as the k-value (Boisvert et al, 2012). The sequence data have been submitted to the GenBank databases under accession No.…”
Section: Structural Proteins
mentioning
confidence: 99%
“…After constructing an A-Bruijn graph, one faces the problem of finding a path in this graph that corresponds to traversing the genome and then correcting errors in the sequence spelled by this path (this genomic path does not have to traverse all edges of the graph). Because the long reads are merely paths in the A-Bruijn graph, one can use the path extension paradigm (37)(38)(39) to derive the genomic path from these (shorter) read-paths. exSPAnder (38) is a module of the SPAdes assembler (24) that finds a genomic…”
[Figure caption: The histograms of the number of 15-mers with given frequencies for the ECOLI dataset from Escherichia coli.]
Section: For Details)
mentioning
confidence: 99%
“…Hence, the A-Bruijn graph can function as an oracle, from which one can efficiently identify the overlaps of a given read with all other reads by considering all possible overlaps at once. The genome is assembled by repeatedly applying this procedure and borrowing the path extension paradigm from short read assemblers (37)(38)(39).…”
Section: For Details)
mentioning
confidence: 99%