2022
DOI: 10.23919/jcn.2022.000008

Iterative coding scheme satisfying GC balance and run-length constraints for DNA storage with robustness to error propagation

Abstract: In this paper, we propose a novel iterative encoding algorithm for DNA storage that satisfies both the GC balance and run-length constraints using a greedy algorithm. DNA strands with a run-length of more than three and a GC balance ratio far from 50% are known to be prone to errors. The proposed encoding algorithm stores data flexibly, with run-length at most 𝑚 and GC balance within 0.5 ± 𝛼 for arbitrary 𝑚 and 𝛼. More importantly, we propose a novel mapping method to reduce the average bit error compa…
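
The two constraints named in the abstract can be illustrated with a minimal sketch. This only checks whether a given strand meets them; it is not the paper's encoding algorithm, and the function names (`max_run_length`, `gc_ratio`, `satisfies_constraints`) are hypothetical, chosen here for illustration.

```python
from itertools import groupby


def max_run_length(strand: str) -> int:
    """Length of the longest run of identical consecutive bases."""
    return max((len(list(group)) for _, group in groupby(strand)), default=0)


def gc_ratio(strand: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty strand)."""
    if not strand:
        return 0.0
    return sum(base in "GC" for base in strand) / len(strand)


def satisfies_constraints(strand: str, m: int, alpha: float) -> bool:
    """True if every run is at most m long and the GC ratio lies within 0.5 +/- alpha."""
    return max_run_length(strand) <= m and abs(gc_ratio(strand) - 0.5) <= alpha
```

For example, `satisfies_constraints("ACGTACGT", m=3, alpha=0.1)` returns `True`, while `"ACGTTTTA"` fails the same check because it contains a run of four T bases.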

citations
Cited by 15 publications
(12 citation statements)
references
References 16 publications
“…For two DNA sequences, a and b, we use H(a, b) to denote the number of different bases at position i of the sequence (a, b). The Hamming distance is calculated using the following formula [26]: …”
Section: Restrictions On DNA Coding (mentioning)
confidence: 99%
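
The formula itself is elided in the citation statement above. As a hedged illustration, the standard Hamming distance between two equal-length sequences counts the positions at which their bases differ. The function name `hamming_distance` is an assumption for this sketch and is not taken from the citing paper.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length DNA sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    # Compare the sequences base by base and count the mismatches.
    return sum(x != y for x, y in zip(a, b))
```

For example, `hamming_distance("ACGT", "ACGA")` is 1, since only the last base differs.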
“…GC content refers to the ratio of the total number of G and C bases to the total number of bases in a DNA sequence, which is closely related to the stability of the DNA sequence and the melting point. The designed DNA sequence should be within the range of 40% ≤ GC(n) ≤ 60% to ensure that the constraints of GC content are met, and the formula of GC(n) is as follows [26]: where GC(n) denotes the GC content of the sequence, |G| and |C| represent the number of bases G and C, respectively, in the sequence n, and |n| denotes the total number of bases in the sequence.…”
Section: Restrictions On DNA Coding (mentioning)
confidence: 99%