2016
DOI: 10.1101/gr.210641.116

Centrifuge: rapid and sensitive classification of metagenomic sequences

Abstract: Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers. The system uses an indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (4.2 GB for 4078 bacterial and 200 archaeal genomes) and classifies sequences at very high speed, allowing it to proc…
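
To make the indexing idea in the abstract concrete, the following is a minimal Python sketch of exact-match counting with a BWT and an FM index (backward search). It is a toy illustration only, not Centrifuge's implementation: the function names (bwt_via_suffix_array, build_fm_index, count_occurrences) are hypothetical, the suffix array is built by naive sorting, and none of the metagenomics-specific optimizations the abstract mentions are modeled.

```python
def bwt_via_suffix_array(text):
    """Return (suffix_array, bwt) for text with a '$' terminator appended.

    Naive construction by sorting suffixes; fine for a toy example,
    far too slow for real genomes.
    """
    text += "$"  # '$' sorts before A, C, G, T in ASCII
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character for each suffix is the character preceding it
    # (index -1 wraps around to the terminator for the suffix at position 0).
    bwt = "".join(text[i - 1] for i in sa)
    return sa, bwt


def build_fm_index(bwt):
    """Build the two FM-index tables used by backward search.

    first[c]  = number of characters in the text strictly smaller than c
    occ[c][i] = number of occurrences of c in bwt[:i]
    """
    alphabet = sorted(set(bwt))
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += bwt.count(c)
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return first, occ


def count_occurrences(pattern, first, occ, n):
    """Count exact occurrences of pattern using backward search.

    Maintains a half-open range [lo, hi) of suffix-array rows whose
    suffixes start with the part of the pattern processed so far.
    """
    lo, hi = 0, n
    for c in reversed(pattern):
        if c not in first:
            return 0  # character never occurs in the indexed text
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    reference = "GATTACAGATTACA"  # made-up sequence for illustration
    _, bwt = bwt_via_suffix_array(reference)
    first, occ = build_fm_index(bwt)
    for read in ("ATTA", "TAC", "GGG"):
        print(read, count_occurrences(read, first, occ, len(bwt)))
```

Each pattern character costs two table lookups, so the query time depends on the read length rather than on the size of the indexed text. That property is what makes this family of indexes attractive for classifying many short reads.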

Citations: cited by 1,158 publications (880 citation statements)
References: 32 publications
“…We also note the recent availability of rapid library kits including PCR-based amplification for low input samples tagged with adaptors; however, the additional PCR step may co-amplify contaminants and entails further processing steps and time associated with PCR amplification and clean-up. Taxonomic classification of metagenomic reads. Accordingly, about 15% of the reads (114) generated in-situ were classified by Centrifuge operating a bacterial, archaeal, viral and eukaryotic database (Kim et al 2016). As seen in Figure 3, bacterial sequences dominated the sample, with minor contributions from Archaea, Eukaryota and Viruses, which were represented by 1, 5 and 2 of the successfully classified sequences, respectively.…”
Section: Results (mentioning)
confidence: 99%
“…Bioinformatics. MinION output files were processed and converted to .fastq using poRe (https://sourceforge.net/projects/rpore/, version 0.17), which was also used to generate sequencing statistics for the MinION run (https://github.com/geomicrosoares). centrifuge (https://github.com/infphilo/centrifuge, version 1.0.3-beta) (Kim, Song et al 2016) was run to classify passed reads using a complete bacterial, archaeal, viral and eukaryotic database (--bmax 1342177280), and pavian (https://github.com/fbreitwieser/pavian, version 0.3) (Breitwieser and Salzberg 2016) was used to plot Sankey diagrams depicting multi-domain profiles of the microbial communities.…”
Section: Site Description (mentioning)
confidence: 99%
“…We also found that taxonomic annotation represents the most powerful information type in our simulation. For comparison, we added scores for NBC (Rosen, Reichenberger & Rosenfeld, 2011), a classifier based on nucleotide composition with in-sample training using 5-mers and 15-mers, and centrifuge (Kim et al, 2016), a similarity-based classifier, both with in-sample and reference data. These programs were given the same information as the corresponding submodels and they rank close to these.…”
Section: Maximum Likelihood Classification (mentioning)
confidence: 99%
“…These methods are often slower than BLAST alone, rendering them computationally prohibitive for large-scale analysis of many millions of short reads. However, the recent development of Centrifuge (Kim et al, 2016) has significantly improved the scalability of the alignment-based algorithm using FM-index. Besides using genomic sequences as reference, the recently published tool Kaiju (Menzel and Krogh, 2015) performs alignments towards protein sequences, achieving faster classification speed than existing tools.…”
Section: Introduction (mentioning)
confidence: 99%