2023
DOI: 10.1038/s41598-023-29267-8
Reference-free lossless compression of nanopore sequencing reads using an approximate assembly approach

Abstract: The amount of data produced by genome sequencing experiments has been growing rapidly over the past several years, making compression important for efficient storage, transfer and analysis of the data. In recent years, nanopore sequencing technologies have seen increasing adoption since they are portable, real-time and provide long reads. However, there has been limited progress on compression of nanopore sequencing reads obtained in FASTQ files since most existing tools are either general-purpose or specializ…

Cited by 3 publications (2 citation statements)
References 23 publications
“…This has prompted researchers to study long-read long compression techniques in greater depth. To reduce the size of raw FASTQ raw sequencing data, several compression techniques have been proposed, including Picopore, ENANO, RENANO, NanoSpring, FastqCLS, and CoLoRd ( Gigante, 2017 ; Dufort y Álvarez et al, 2020 , 2021 ; Meng et al, 2021 ; Kokot et al, 2022 ; Lee and Song, 2022 ). As an example, Picopore ( Gigante, 2017 ) provides a software suite that contains three compression methods: raw, lossless and deep lossless compression.…”
Section: Bioinformatics Of Nanopore Sequencing (mentioning)
confidence: 99%
“…In comparison to ENANO, RENANO ( Dufort y Álvarez et al, 2021 ) achieves significantly improved compression of read sequences but is limited to aligned data with a usable reference. In contrast, NanoSpring ( Meng et al, 2021 ) is a reference-free tool that relies on approximate assembly methods in order to achieve compression gains, but it requires more time and memory to accomplish compression gains. The FastqCLS ( Lee and Song, 2022 ) compression algorithm uses read reordering to compress long reads of long sequencing data without sacrificing information and performs well in terms of compression ratio.…”
Section: Bioinformatics Of Nanopore Sequencing (mentioning)
confidence: 99%