2013
DOI: 10.3844/jcssp.2013.690.698

A Deoxyribonucleic Acid Compression Algorithm Using Auto-Regression and Swarm Intelligence

Abstract: DNA compression has become a major challenge for many researchers as a result of the exponential increase in the number of DNA sequences produced and stored in gene databases. In this research we attempt to address this challenge by developing a lossless compression algorithm. The proposed algorithm works in horizontal mode using a substitutional-statistical technique based on Auto-Regression (AR) modeling; the model parameters are determined using Particle Swarm Optimization (PSO). This algorithm is called Swar…

Cited by 4 publications
(4 citation statements)
References 23 publications
“…Many file formats are used to store the genomic data such as FASTA, FASTQ, BAM/SAM and VCF/BCF [7][8][9][10][11][12][13][14][15]. In these file formats, in addition to the raw genomic or protein sequences, other information, such as in FASTQ file identifiers and quality scores are added.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…In June 2019, approximately 329,835,282,370 bases were generated [18]. All of these data are stored in special databases, which have been developed by the scientific community such as the 1000 Genomes Project [6], the International Cancer Genome Project [19] and the ENCODE project [12,17,20]. This advancement has caused the production of a vast amount of redundant data at a much higher rate than the older technologies, which will increase in the future and may go beyond the limit of storage and bandwidth capacity [3][4][5].…”
Citation type: mentioning
confidence: 99%
“…So, these facts conclude that DNA sequences should be compressed. Human DNA almost has 3 billion bases and among then more than 99% are the same in all human [8], [9]. Data compression reveals certain theoretical ideas such as entropy, mutual information and complexity between sequences of different genomes.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…A particle updates its velocity and position based on its inertia, own experience and gained knowledge from other particles in the swarm, aiming to find the optimal solution of the problem [17].…”
Section: Particle Swarm Optimization | Citation type: mentioning
confidence: 99%
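
The update rule described in the last citation statement (inertia, the particle's own experience, and knowledge gained from the swarm) can be illustrated with a minimal sketch of a generic global-best PSO. This is not the cited paper's implementation: the function name `pso_minimize`, the coefficient values, the search bounds, and the toy objective are all assumptions chosen for illustration only.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5,
                 lower=-5.0, upper=5.0, seed=0):
    """Generic global-best PSO (illustrative; all parameter values are assumed)."""
    rng = random.Random(seed)

    # Random initial positions inside the assumed bounds, zero initial velocities.
    positions = [[rng.uniform(lower, upper) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position ("own experience").
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]

    # The swarm shares the best position found so far ("knowledge from others").
    g = min(range(n_particles), key=lambda i: personal_best_val[i])
    global_best = personal_best[g][:]
    global_best_val = personal_best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]                                   # inertia
                    + c_cognitive * r1 * (personal_best[i][d] - positions[i][d])  # own experience
                    + c_social * r2 * (global_best[d] - positions[i][d])          # swarm knowledge
                )
                positions[i][d] += velocities[i][d]

            value = objective(positions[i])
            if value < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = value
                if value < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = value

    return global_best, global_best_val


def sphere(x):
    """Toy objective (sum of squares) with its minimum at the origin."""
    return sum(v * v for v in x)


best_position, best_value = pso_minimize(sphere, dim=2)
print(best_position, best_value)
```

In the setting of the abstract, the objective would presumably score a candidate set of AR model parameters, but the source does not give that function, so a toy objective is used here.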