2019
DOI: 10.1007/978-3-030-23873-5_17

GeCo2: An Optimized Tool for Lossless Compression and Analysis of DNA Sequences

Cited by 17 publications (25 citation statements)
References 26 publications
“…e DNA sequence corpus contains 17 DNA sequences with different sizes and reflects the main domains and kingdoms. HRCM selected BuEb sequence which is a DNA sequence corpus as the reference sequence to compress others and compared the results with widely used general-purpose compression methods gzip, bzip2, lzma, and ppmd and the state-of-the-art special purpose genome compression methods MFCompress [32] and GeCo2 [34]. MFCompress tool was tested in default mode and the best mode (denoted by MFC-2 and MFC-3, respectively, according to the original paper).…”
Section: Compression Results and Analysis Of Only One-version (mentioning)
confidence: 99%
“…DELIMINATE achieves better compression ratio than general-purpose compression methods although it uses almost the same compression time. In 2011, 2013, 2016, and 2019, respectively, Pinho et al published DNAEnc3 [31], MFCompress [32], GeCo [33], and GeCo2 [34] based on Markov models. DNAEnc3 partitions sequence to nonoverlapping blocks of fixed size, which are then encoded by the best one of the Markov models of different orders.…”
Section: Related Work (mentioning)
confidence: 99%
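
The block-wise model selection described in the statement above can be sketched as follows. This is a minimal illustration, not DNAEnc3's actual implementation: the function name `block_costs`, the chosen orders, the block size, and the additive smoothing parameter `alpha` are all assumptions made for the example. Each fixed-size, non-overlapping block is scored under several adaptive Markov (finite-context) models of different orders, and the order with the lowest estimated cost in bits is recorded for that block.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def block_costs(seq, orders=(1, 2, 4), block_size=8, alpha=1.0):
    """For each non-overlapping block, return (best_order, cost_in_bits).

    Hypothetical sketch: every model keeps learning across blocks, and the
    cost of a symbol is -log2 of its smoothed probability under that model.
    """
    # One count table per order: context string -> symbol -> count.
    counts = {k: defaultdict(lambda: defaultdict(int)) for k in orders}
    choices = []
    for start in range(0, len(seq), block_size):
        end = min(start + block_size, len(seq))
        bits = {k: 0.0 for k in orders}
        for i in range(start, end):
            sym = seq[i]
            for k in orders:
                ctx = seq[max(0, i - k):i]  # up to k preceding symbols
                table = counts[k][ctx]
                total = sum(table.values())
                p = (table[sym] + alpha) / (total + alpha * len(ALPHABET))
                bits[k] += -math.log2(p)
                table[sym] += 1  # update the model after coding the symbol
        best = min(orders, key=lambda k: bits[k])
        choices.append((best, bits[best]))
    return choices


if __name__ == "__main__":
    for order, cost in block_costs("ACGT" * 8):
        print(f"order={order} cost={cost:.2f} bits")
```

On a highly repetitive input, the per-block cost should fall as the models accumulate counts. A real encoder would also have to signal the chosen order for each block to the decoder, which this sketch omits.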
“…There are also algorithms to cope with low computational resources. From our experience, we would highlight XM [68] and GeCo/GeCo2 [86, 88] given their ability to compress genomic sequences with high compression ratios. On average, XM is slightly better concerning compression ratio (approximately 0.4% and 0.2% over GeCo and GeCo2, respectively).…”
Section: Results (mentioning)
confidence: 99%
“…It has sub-programs to deal with inverted repeats and uses cache-hashes for deeper context models. In GeCo2 [88], the mixture of models is enhanced, where each context model or tolerant context model now has a specific decay factor. Additionally, specific cache-hash sizes and the ability to run only a context model with inverted repeats are available.…”
Section: Introduction (mentioning)
confidence: 99%
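
To make the idea of a per-model decay factor concrete, here is a minimal sketch of one mixing step. The statement above only says that each model has its own decay factor; the specific update rule used here (raise each weight to its model's decay factor, multiply by the probability that model gave the observed symbol, then normalize) is an assumption for illustration, and the name `mix_step` is hypothetical. It also assumes every model assigns a strictly positive probability to each symbol.

```python
def mix_step(weights, probs, gammas, symbol):
    """Perform one mixing step over several models.

    weights: current mixture weights, one per model (summing to 1).
    probs:   one dict per model mapping symbol -> predicted probability.
    gammas:  one decay factor per model (assumed to lie in (0, 1]).
    symbol:  the symbol that was actually observed.

    Returns (mixed, new_weights): the mixed distribution used to code the
    symbol, and the weights to use for the next symbol.
    """
    # Mixed prediction: weighted average of the model predictions.
    mixed = {
        s: sum(w * p[s] for w, p in zip(weights, probs))
        for s in probs[0]
    }
    # Assumed update: w_i <- w_i ** gamma_i * P_i(symbol), then normalize.
    raw = [(w ** g) * p[symbol] for w, p, g in zip(weights, probs, gammas)]
    total = sum(raw)
    new_weights = [r / total for r in raw]
    return mixed, new_weights


if __name__ == "__main__":
    good = {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}
    poor = {"A": 0.1, "C": 0.3, "G": 0.3, "T": 0.3}
    mixed, w = mix_step([0.5, 0.5], [good, poor], [0.9, 0.9], "A")
    print(mixed["A"], w)
```

A smaller decay factor makes a model's weight forget its history faster, so giving each model its own factor lets short-memory and long-memory models adapt at different rates.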
“…Figure 1 depicts the complexity profiles of four human Herpesvirus whole genomes using the same scale, where redundant regions are highlighted with blue (below a Bps of one). GTO uses GeCo2 [6] and AC [7] compressors to estimate the local complexity of DNA and amino acid sequences, respectively. However, GTO is not limited to use these data compressors.…”
Section: Validation (mentioning)
confidence: 99%
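
A complexity profile of the kind described in the statement above can be approximated with a small adaptive model. The sketch below is not GTO or GeCo2 code: it uses a single order-3 context model with additive smoothing as a stand-in compressor, and the names `complexity_profile` and `low_complexity_regions`, the window size, and the smoothing parameter are assumptions. It estimates the bits per symbol (Bps) at each position, smooths the values with a moving average, and reports the regions whose Bps falls below one.

```python
import math
from collections import defaultdict


def complexity_profile(seq, order=3, window=5, alpha=1.0, alphabet="ACGT"):
    """Return a smoothed per-position estimate of bits per symbol."""
    counts = defaultdict(lambda: defaultdict(int))
    bits = []
    for i, sym in enumerate(seq):
        ctx = seq[max(0, i - order):i]
        table = counts[ctx]
        total = sum(table.values())
        p = (table[sym] + alpha) / (total + alpha * len(alphabet))
        bits.append(-math.log2(p))
        table[sym] += 1  # adapt the model after scoring the symbol
    # Centered moving average, truncated at the sequence ends.
    half = window // 2
    smoothed = []
    for i in range(len(bits)):
        lo, hi = max(0, i - half), min(len(bits), i + half + 1)
        smoothed.append(sum(bits[lo:hi]) / (hi - lo))
    return smoothed


def low_complexity_regions(profile, threshold=1.0):
    """Return half-open (start, end) index ranges where profile < threshold."""
    regions, start = [], None
    for i, value in enumerate(profile):
        if value < threshold and start is None:
            start = i
        elif value >= threshold and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(profile)))
    return regions


if __name__ == "__main__":
    profile = complexity_profile("ACGT" * 20)
    print(low_complexity_regions(profile))
```

With a stronger compressor in place of the order-3 model, the same thresholding step would flag the redundant regions that the quoted figure highlights.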