2013
DOI: 10.1093/bioinformatics/btt310

Informed and automated k-mer size selection for genome assembly

Abstract: Our tool KmerGenie is freely available at: http://kmergenie.bx.psu.edu/.

Cited by 684 publications (584 citation statements)
References 20 publications
“…In the k-mer profiling problem [6,7] we are given a string T, an interval [k_1..k_2] of lengths and an interval [f_1..f_2] of frequencies, and we are asked to compute the matrix profile[k_1..k_2, f_1..f_2] defined as follows:…”
Section: Kernels and Complexity Measures On K-mers
mentioning
confidence: 99%
“…In practice profile is often computed by running a k-mer extraction algorithm k_2 − k_1 + 1 times, and by scanning the output of all such runs (see e.g. [6] and references therein). The following lemma shows that we can compute profile in just one pass over the BWT of the input string, and in linear time in the size of profile:…”
Section: Kernels and Complexity Measures On K-mers
mentioning
confidence: 99%
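
The multi-run practice described in the statement above can be illustrated with a minimal sketch. The quoted definition of profile is elided, so the code assumes (this is an assumption, not taken from the citing paper) that entry (k, f) holds the number of distinct k-mers of T that occur exactly f times. The function names `kmer_counts` and `naive_profile` are hypothetical, and this is the naive approach of one extraction pass per k, not the BWT-based single-pass method the statement mentions.

```python
from collections import Counter


def kmer_counts(text: str, k: int) -> Counter:
    """Count every k-mer (substring of length k) in text."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def naive_profile(text: str, k1: int, k2: int, f1: int, f2: int) -> dict:
    """Naive profile: one k-mer extraction run per k in [k1..k2].

    ASSUMPTION: profile[(k, f)] is the number of distinct k-mers
    that occur exactly f times in text.
    """
    profile = {}
    for k in range(k1, k2 + 1):  # k2 - k1 + 1 separate extraction runs
        # Histogram of multiplicities: frequency -> number of distinct k-mers
        histogram = Counter(kmer_counts(text, k).values())
        for f in range(f1, f2 + 1):
            profile[(k, f)] = histogram.get(f, 0)
    return profile


if __name__ == "__main__":
    p = naive_profile("ACGTACGT", 2, 3, 1, 2)
    for (k, f), n in sorted(p.items()):
        print(f"k={k} f={f}: {n} distinct k-mers")
```

Each value of k triggers a full pass over the string, which is the cost that a single-pass method would avoid.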
“…were analysed in Kmergenie (Chikhi and Medvedev, 2014) using a variety of kmers (15,20,25,30,35,40,45) and assuming a diploid model, to estimate the unique haploid average coverage, which was used to estimate genome size by dividing the total length of the reads (16552689327 bp) by the unique average coverage (peak height ranging from 15-20). C) Mapping the combined datasets to 18737 open reading frames derived from a salivary gland transcriptome (manuscript in preparation) using CLC Genomics Workbench.…”
Section: Estimating Genome Size
mentioning
confidence: 99%
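
The arithmetic in the statement above can be worked through as a short sketch. It takes the total read length (16552689327 bp) and the reported peak range (15 to 20) from the quotation and divides one by the other, as the statement describes. The function name `estimate_genome_size` is hypothetical, and the resulting range is only what that division yields, not a figure reported by the citing paper.

```python
def estimate_genome_size(total_read_bases: int, coverage_peak: float) -> float:
    """Estimate genome size as total read length divided by average coverage."""
    if coverage_peak <= 0:
        raise ValueError("coverage_peak must be positive")
    return total_read_bases / coverage_peak


# Values taken from the quoted citation statement.
TOTAL_READ_BASES = 16_552_689_327
PEAK_LOW, PEAK_HIGH = 15, 20

# A higher coverage peak implies a smaller genome, and vice versa.
smallest = estimate_genome_size(TOTAL_READ_BASES, PEAK_HIGH)
largest = estimate_genome_size(TOTAL_READ_BASES, PEAK_LOW)

if __name__ == "__main__":
    print(f"Estimated genome size: {smallest / 1e6:.0f} Mb to {largest / 1e6:.0f} Mb")
```

Under these inputs the estimate spans roughly 0.83 Gb to 1.10 Gb, which shows how sensitive the result is to the choice of peak.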
“…The quality-filtered sequence reads were assembled into a number of contig sequences. The analysis has been performed using the "De novo assembly" option of the CLC Genomics Workbench version 6.0.4. The optimal k-mer size was automatically determined using KmerGenie [3]. The contigs were linked and placed into scaffolds or supercontigs.…”
mentioning
confidence: 99%