DNA nanotechnology often requires collections of oligonucleotides called "DNA free energy gap codes" that do not produce erroneous crosshybridizations in a competitive multiplexing environment. This paper addresses the question of how to design these codes to accomplish a desired amount of work within an acceptable error rate. Using a statistical thermodynamic and probabilistic model of DNA code fidelity and mathematical random coding theory methods, theoretical lower bounds on the size of DNA codes are given. More importantly, DNA code design parameters (e.g., strand number, strand length, and sequence composition) needed to achieve experimental goals are identified.
We discuss the concept of t-gap block isomorphic subsequences and use it to describe new abstract string metrics that are similar to the Levenshtein insertion-deletion metric. Some of the metrics that we define can be used to model a thermodynamic distance function on single-stranded DNA sequences. Our model captures a key aspect of the nearest neighbor thermodynamic model for hybridized DNA duplexes. One version of our metric gives the maximum number of stacked pairs of hydrogen bonded nucleotide base pairs that can be present in any secondary structure in a hybridized DNA duplex without pseudoknots. Thermodynamic distance functions are important components in the construction of DNA codes, and DNA codes are important components in biomolecular computing, nanotechnology, and other biotechnical applications that employ DNA hybridization assays. We show how our new distances can be calculated by using a dynamic programming method, and we derive a Varshamov-Gilbert-like lower bound on the size of some codes using these distance functions as constraints. We also discuss software implementation of our DNA code design methods.
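For orientation, the classical Levenshtein insertion-deletion metric that the new metrics are compared to can itself be computed by dynamic programming over the longest common subsequence (LCS). The sketch below shows only that baseline and is not the t-gap block isomorphic subsequence metric or the stacked-pair distance of the abstract; the function names `lcs_length` and `indel_distance` are illustrative assumptions, not taken from the authors' software.

```python
def lcs_length(x: str, y: str) -> int:
    """Length of the longest common subsequence of x and y (row-by-row DP)."""
    # prev[j] holds the LCS length of the processed prefix of x and y[:j].
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Matching symbols extend the best alignment of both shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop one symbol from either x or y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_distance(x: str, y: str) -> int:
    """Insertion-deletion distance: symbols to delete from x plus symbols to insert."""
    return len(x) + len(y) - 2 * lcs_length(x, y)


if __name__ == "__main__":
    print(indel_distance("ACGT", "AGT"))
    print(indel_distance("ACGT", "TGCA"))
```

The table has one row per symbol of `x` and one column per symbol of `y`, so the running time is proportional to the product of the two lengths. In a hybridization setting one would typically compare a strand against the reverse complement of another strand; that step is an assumption about usage and is omitted here.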