2019
DOI: 10.1186/s13059-019-1632-4

Skmer: assembly-free and alignment-free sample identification using genome skims

Abstract: The ability to inexpensively describe taxonomic diversity is critical in this era of rapid climate and biodiversity changes. The recent genome-skimming approach extends current barcoding practices beyond short markers by applying low-pass sequencing and recovering whole organelle genomes computationally. This approach discards the nuclear DNA, which constitutes the vast majority of the data. In contrast, we suggest using all unassembled reads. We introduce an assembly-free and alignment-free tool, Skmer, to co…

citations
Cited by 105 publications
(142 citation statements)
references
References 82 publications
“…These approaches rely on assembly construction pipelines (e.g., Jin et al, 2019) to remove contaminants (i.e., to avoid mis-assembly or to filter out mis-assembled contigs). Elsewhere, we have advocated going beyond organelle genomes and using all reads in an assembly-free fashion to increase the resolution of taxonomic identification (Balaban et al, 2019;Sarmashghi et al, 2019). However, this goal has been hampered by the presence of contaminants.…”
Section: Discussion (mentioning)
confidence: 99%
“…Recently, Sarmashghi et al, 2019 developed a method, Skmer, that accurately computes genomic distance between genome skims by simply analyzing k-mers (short substrings of length k) in both genome skims.…”
Section: Introduction (mentioning)
confidence: 99%
“…Under standard implementations where computing set unions and set intersections has a linear time complexity in the sizes of the sets, this filtering step has a time complexity of O(n²L) for pairwise read alignment. Recently Jaccard similarity has been used in a variety of applications such as genome skimming [Denver et al, 2016], and in new methods to compare whole genomes and study taxonomic diversity in the microbiome [Ondov et al, 2016, Sarmashghi et al, 2019].…”
Section: Introduction (mentioning)
confidence: 99%
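
The citation statement above refers to Jaccard similarity computed from set unions and intersections. As a minimal sketch, assuming sequences are compared through their sets of k-mers (short substrings of length k), the computation could look like the following. The function names `kmers` and `jaccard` are illustrative, and this is the plain Jaccard index only, not the distance estimator used by Skmer or any other cited tool.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    x = kmers("ACGTAC", 3)
    y = kmers("ACGTTC", 3)
    print(jaccard(x, y))
```

With Python's hash-based sets, the union and intersection each take expected time linear in the set sizes, which matches the "linear time complexity" assumption in the quoted statement.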
“…In Mash [39], the MinHash [6] approach is used to reduce the input sequences to small 'sketches' which can be used to rapidly approximate the Jaccard index. Skmer [47] is a further improvement of this approach. In a previous paper, we proposed another way to infer evolutionary distances between DNA sequences based on the number of word matches between them, and we generalized this to so-called spaced-word matches [36].…”
Section: Introduction (mentioning)
confidence: 99%
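
The statement above describes reducing sequences to small sketches that approximate the Jaccard index. A rough sketch of one common variant, bottom-s MinHash, is given below. The hash function (BLAKE2b truncated to 64 bits), the helper names, and the parameter `s` are assumptions made for illustration; this is not the implementation of Mash or Skmer.

```python
import hashlib


def kmer_hashes(seq: str, k: int) -> set[int]:
    """Hash every k-mer of seq to a 64-bit integer (hash choice is an assumption)."""
    return {
        int.from_bytes(
            hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big"
        )
        for i in range(len(seq) - k + 1)
    }


def sketch(hashes: set[int], s: int) -> list[int]:
    """Bottom-s sketch: the s smallest hash values, in sorted order."""
    return sorted(hashes)[:s]


def estimate_jaccard(sk_a: list[int], sk_b: list[int], s: int) -> float:
    """Estimate the Jaccard index from two bottom-s sketches.

    Take the s smallest hashes of the merged sketches and count the
    fraction that appear in both sketches.
    """
    merged = sorted(set(sk_a) | set(sk_b))[:s]
    if not merged:
        return 1.0
    shared = set(sk_a) & set(sk_b)
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    s = 4  # sketch size, chosen arbitrarily for this toy example
    a = sketch(kmer_hashes("ACGTACGGTCA", 3), s)
    b = sketch(kmer_hashes("ACGTTCGGTCA", 3), s)
    print(estimate_jaccard(a, b, s))
```

When `s` is at least as large as the number of distinct k-mers, the estimate equals the exact Jaccard index; smaller sketches trade accuracy for memory and speed.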