2019 · Preprint
DOI: 10.1101/559807

FQSqueezer: k-mer-based compression of sequencing data

Abstract: Motivation: The amount of genomic data that needs to be stored is huge. Therefore it is not surprising that a lot of work has been done in the field of specialized data compression of FASTQ files. The existing algorithms are, however, still imperfect, and the best tools produce quite large archives. Results: We present FQSqueezer, a novel compression algorithm for sequencing data able to process single- and paired-end reads of variable lengths. It is based on the ideas from the famous prediction by partial matching…

Cited by 3 publications (2 citation statements)
References 28 publications
“…Since sequence headers contribute marginally to the sizes of FASTA/FASTQ files, they are compressed with well-established token-based method analogously as in FQSqueezer [23] or ENANO.…”
Section: CoLoRd Overview (mentioning)
confidence: 99%
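The token-based header compression the statement refers to can be sketched roughly as follows. The field splitting and the match/delta/literal codes here are illustrative assumptions, not the actual FQSqueezer or ENANO format; real tools use their own token alphabets and entropy-code the resulting stream:

```python
import re

def tokenize(header):
    # Split a FASTQ header into alternating numeric, alphabetic,
    # and delimiter tokens, so successive headers line up field by field.
    return re.findall(r'\d+|[A-Za-z]+|[^A-Za-z\d]+', header)

def encode(prev_tokens, tokens):
    """Encode a header relative to the previous one, token by token:
    identical tokens cost almost nothing, numeric fields are stored
    as small deltas, and everything else falls back to a literal."""
    out = []
    for i, tok in enumerate(tokens):
        prev = prev_tokens[i] if i < len(prev_tokens) else None
        if tok == prev:
            out.append(("match",))
        elif prev is not None and tok.isdigit() and prev.isdigit():
            out.append(("delta", int(tok) - int(prev)))
        else:
            out.append(("literal", tok))
    return out

h1 = tokenize("@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345")
h2 = tokenize("@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:346")
print(encode(h1, h2))
```

On typical instrument headers almost every token becomes a cheap `match` or small `delta`, which is why header streams compress so well despite looking verbose.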
“…Therefore, many techniques have been proposed to store the data in compressed form. In [26], Sebastian proposed a compression algorithm to process single and paired reads. The proposed technique is based on partial matching and dynamic Markov coder algorithm.…”
Section: Literature Review (mentioning)
confidence: 99%
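As a rough illustration of the prediction-by-partial-matching idea the statement mentions: an order-k context model counts which symbol follows each k-length context and, when a context has never been seen, escapes to progressively shorter contexts. The class below is a hypothetical minimal sketch, far simpler than FQSqueezer's actual k-mer model:

```python
from collections import defaultdict

class OrderKModel:
    """Minimal PPM-style predictor: count next-symbol frequencies for
    every context of length 0..k, and fall back (escape) to shorter
    contexts when the full-length context is unseen."""

    def __init__(self, k=3):
        self.k = k
        # counts[context][symbol] -> number of times symbol followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, seq):
        for i in range(len(seq)):
            for order in range(self.k, -1, -1):
                if i >= order:
                    ctx = seq[i - order:i]
                    self.counts[ctx][seq[i]] += 1

    def predict(self, context):
        """Most likely next symbol, escaping to shorter contexts."""
        for order in range(min(self.k, len(context)), -1, -1):
            ctx = context[len(context) - order:]
            if self.counts[ctx]:
                return max(self.counts[ctx], key=self.counts[ctx].get)
        return None

model = OrderKModel(k=3)
model.update("ACGTACGTACGT")
print(model.predict("CGT"))  # prints A: "CGT" was always followed by "A"
```

In a real compressor the context counts would feed an arithmetic coder rather than a hard prediction, so frequent symbols cost fractions of a bit.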