2022
DOI: 10.1093/bioinformatics/btac845

KMCP: accurate metagenomic profiling of both prokaryotic and viral populations by pseudo-mapping

Abstract: Motivation: The growing number of microbial reference genomes enables the improvement of metagenomic profiling accuracy but also imposes greater requirements on the indexing efficiency, database size, and runtime of taxonomic profilers. Additionally, most profilers focus mainly on bacterial, archaeal, and fungal populations, while less attention is paid to viral communities. Results: We present KMCP, a novel k-mer-based metagen…

Citations: cited by 28 publications (28 citation statements)
References: 74 publications (107 reference statements)

“…Specifically, these synthetic communities had 50 present species at 95-97.5% ANI and 150 genomes between 85-90% ANI to the nearest ANI genome in the database ( Methods ). We compared sylph’s species-leveling profiling against three other methods, Bracken [10] (with Kraken2 [8]), KMCP [30], and ganon [9]. We chose Bracken due to its wide use, KMCP because it achieved the best CAMI2 marine benchmarks in its publication, and ganon because it is relatively time and memory-efficient.…”
Section: Results (mentioning)
confidence: 99%
“…The following classifiers and profilers are supported in version 1.1.0 of nf-core/taxprofiler: Kraken2 (Wood, Lu, and Langmead 2019), Bracken (Lu et al 2017), KrakenUniq (F. P. Breitwieser, Baker, and Salzberg 2018), Centrifuge (Kim et al 2016), MALT (Vågene et al 2018), DIAMOND (Buchfink, Reuter, and Drost 2021), Kaiju (Menzel, Ng, and Krogh 2016), MetaPhlAn (Blanco-Míguez et al 2023), mOTUs (Ruscheweyh et al 2022), ganon (Piro et al 2020), KMCP (Shen et al 2023).…”
Section: Methods (mentioning)
confidence: 99%
“…Additionally on top of this, also simultaneously for each of the classifiers, an arbitrary number of databases as supplied by the user. As of version 1.1.0, the following classifiers and profilers are available: Kraken2 (Wood, Lu, and Langmead 2019), Bracken (Lu et al 2017), KrakenUniq (F. P. Breitwieser, Baker, and Salzberg 2018), Centrifuge (Kim et al 2016), MALT (Vågene et al 2018), DIAMOND (Buchfink, Reuter, and Drost 2021), Kaiju (Menzel, Ng, and Krogh 2016), MetaPhlAn (Blanco-Míguez et al 2023), mOTUs (Ruscheweyh et al 2022), ganon (Piro et al 2020), and KMCP (Shen et al 2023). Databases are also supplied via a input TSV file, which also allows per-database custom classification parameters - meaning a given database can be supplied multiple times each with different parameters or multiple different databases per profiler.…”
Section: Description (mentioning)
confidence: 99%
“…We compare ganon2 (v2.0.0) with the state-of-the-art methods: kmcp (v0.9.2) [18], kraken2 (v2.1.2) [19], bracken (v2.8) [20], and metacache (v2.3.1) [21]. Methods were selected based on the following criteria: open-source code, actively maintained and/or highly used, scalable to handle very large data in terms of database construction and sequence classification, execution in doable time (hours/few days for building, minutes/hours per sample), possibility to create custom databases with nucleotide sequences, ability to perform taxonomic binning and/or profiling.…”
Section: Evaluations (mentioning)
confidence: 99%