2015
DOI: 10.1186/s13742-015-0097-y

GenomeTester4: a toolkit for performing basic set operations - union, intersection and complement on k-mer lists

Abstract: Background: K-mer-based methods of genome analysis have attracted great interest because they do not require genome assembly and can be performed directly on sequencing reads. Many analysis tasks require one to compare k-mer lists from different sequences to find words that are either unique to a specific sequence or common to many sequences. However, no stand-alone k-mer analysis tool currently allows one to perform these algebraic set operations. Findings: We have developed the GenomeTester4 toolkit, which contai…
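The three set operations named in the title can be illustrated with a minimal Python sketch. This is not GenomeTester4's implementation or interface: the function names, the in-memory `Counter` representation, and the way counts are combined (summed for union, minimum for intersection) are all assumptions made for illustration.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (forward strand only, an assumption)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def union(a: Counter, b: Counter) -> Counter:
    """K-mers present in either list; counts are summed (illustrative choice)."""
    return a + b


def intersection(a: Counter, b: Counter) -> dict:
    """K-mers common to both lists; keeps the smaller count (illustrative choice)."""
    return {kmer: min(a[kmer], b[kmer]) for kmer in a.keys() & b.keys()}


def complement(a: Counter, b: Counter) -> dict:
    """K-mers unique to the first list, i.e. present in a but absent from b."""
    return {kmer: n for kmer, n in a.items() if kmer not in b}


# Two toy sequences, hypothetical data for illustration only.
list_a = kmer_counts("ACGTACGT", 4)
list_b = kmer_counts("TACGTTT", 4)

print(sorted(union(list_a, list_b)))         # words found in either sequence
print(sorted(intersection(list_a, list_b)))  # words common to both sequences
print(sorted(complement(list_a, list_b)))    # words unique to the first sequence
```

Intersection answers the "common to many sequences" question from the abstract, and complement answers the "unique to a specific sequence" question.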

Citations: cited by 43 publications (47 citation statements)
References: 14 publications
“…‘PhenotypeSeeker modeling’ takes either assembled contigs or raw-read data as an input and builds a statistical model for phenotype prediction. The method starts with counting all possible k-mers from the input genomes, using the GenomeTester4 software package [14], followed by k-mer filtering by their frequency in strains. Subsequently, the k-mer selection for regression analysis is performed.…”
Section: Results (citation type: mentioning)
confidence: 99%
“…All operations with k-mers are performed using the GenomeTester4 software package containing the glistmaker, glistquery and glistcompare programs [14]. At first, all k-mers from all samples are counted with glistmaker, which takes either FASTA or FASTQ files as an input and enables us to set the k-mer length up to 32 nucleotides.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…For instance, there are many alternatives for the script producing k-mer tables. One may use GenomeTester4 (Kaplinski et al, 2015) or any of the tools reviewed and compared by Pérez et al (2016), including Jellyfish (Marçais and Kingsford, 2011) and Tallymer (Kurtz et al, 2008).…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…The final database contains specific k-mers for each internal and external node (strain) represented in the guide tree and an index file containing the database structure and k-mer counts. In our study, we created the databases with k=32 using the GenomeTester4 software [17]. Longer kmers would give only strain-specific k-mers, shorter k-mers would give only node specific k-mers.…”
Section: Building the K-mer Database (citation type: mentioning)
confidence: 99%
“…All 2,758 available bacterial genomes from the NCBI RefSeq database (release 65) were used. For every two bacteria, the expected amount of shared k-mers (Eshared) was calculated and the observed amount (Oshared) was counted using the GenomeTester4 software [17]. The expected value was calculated by assuming their genome sequences were random strings.…”
Section: Multi-locus Sequence Typing Of E Coli Samples (citation type: mentioning)
confidence: 99%
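
Two steps described in the citation statements above, counting k-mers of length up to 32 and then filtering them by their frequency in strains, can be sketched in Python as follows. This is an illustrative sketch only and does not reproduce glistmaker or PhenotypeSeeker: the function names, the thresholds `min_strains` and `max_strains`, the rule of skipping words that contain non-ACGT characters, and the toy strain data are all assumptions.

```python
from collections import Counter

MAX_K = 32  # upper limit on k-mer length quoted above for glistmaker


def count_kmers(sequences, k):
    """Count k-mers over an iterable of sequence strings (already parsed reads or contigs)."""
    if not 1 <= k <= MAX_K:
        raise ValueError(f"k must be between 1 and {MAX_K}, got {k}")
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Assumption: words containing ambiguous bases such as N are skipped.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def filter_by_strain_frequency(per_strain, min_strains, max_strains):
    """Keep k-mers present in at least min_strains and at most max_strains strains."""
    presence = Counter()
    for counts in per_strain.values():
        presence.update(counts.keys())  # each strain contributes 1 per distinct k-mer
    return {kmer for kmer, n in presence.items() if min_strains <= n <= max_strains}


# Hypothetical toy strains, for illustration only.
strains = {
    "strain_1": count_kmers(["ACGTA"], 3),
    "strain_2": count_kmers(["ACGTT"], 3),
    "strain_3": count_kmers(["ACGNN"], 3),
}
print(sorted(filter_by_strain_frequency(strains, min_strains=1, max_strains=2)))
```

With these thresholds, a k-mer found in every strain is dropped. Whether the cited tools filter this way is not stated in the excerpts, so the thresholds should be read as placeholders.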