2020
DOI: 10.1038/s41598-020-74091-z

Codon optimization with deep learning to enhance protein expression

Abstract: Heterologous expression is the main approach for recombinant protein production in genetic synthesis, for which codon optimization is necessary. The existing optimization methods are based on biological indexes. In this paper, we propose a novel codon optimization method based on deep learning. First, we introduce the concept of codon boxes, via which DNA sequences can be recoded into codon box sequences while ignoring the order of bases. Then, the problem of codon optimization can be converted to sequence anno…
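The abstract is truncated here and does not define a codon box, so the following is only an illustrative sketch of one literal reading of "ignoring the order of bases": each codon is replaced by its three bases in sorted order, so codons that differ only in base order share a label. The function name `to_codon_boxes` and the sorted-string labelling are assumptions made for illustration, not the paper's definition.

```python
def to_codon_boxes(dna: str) -> list[str]:
    """Recode a coding sequence codon by codon.

    ASSUMPTION: a "codon box" is modelled as the codon's three bases in
    sorted order (an order-insensitive label). The paper's actual
    definition may differ.
    """
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Split into codons, then sort the bases inside each codon.
    return ["".join(sorted(dna[i:i + 3])) for i in range(0, len(dna), 3)]


# ATG, GCT and TCG: the last two contain the same bases in a different order.
boxes = to_codon_boxes("ATGGCTTCG")
print(boxes)  # ['AGT', 'CGT', 'CGT']
```

Under this reading, GCT and TCG fall into the same box even though they encode different amino acids, so a box sequence alone would not determine the DNA sequence; some further step would be needed to recover it.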

Cited by 124 publications (65 citation statements)
References 45 publications
“…New and improved methodologies will continue to be explored to optimize the stability and translation efficiency of mRNA and the delivery of LNP-mRNA complexes. Novel approaches, including deep learning and genome-wide screening method to identify the optimal codon usage and UTR design of mRNA are already being tested empirically [72,115]. Recent studies have screened a library of the total mRNA containing 5'-UTR using computational and empirical analyses and determined the optimal 5'-UTR for the maximum RNA stability and translation efficiency in vitro and in vivo [116,117].…”
Section: Discussion (mentioning)
confidence: 99%
“…Two additional codon optimization methods involve the use of the codons with human bias and the maximum adaptation index [69,70]. Other bioinformatics approaches can be explored to further enhance the stability of mRNA, e.g., via design of the secondary structures and prediction of the expression level based on deep learning [71,72].…”
Section: Codon Optimization (mentioning)
confidence: 99%
“…[12,16,20,23–25] However, other methods have been proposed [18,26,27]. While classical approaches such as GAs can be highly performant, the fraction of solution space that is sampled in a fixed number of iterations decreases exponentially as the polypeptide chain length grows. Thorough sampling of the solutions space is therefore often intractable with biologically relevant use-cases.…”
Section: Introduction (mentioning)
confidence: 99%
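The exponential scaling described in the statement above can be made concrete with a short calculation. The sketch below counts the synonymous DNA encodings of a peptide under the standard genetic code (the product of each residue's codon degeneracy) and the fraction of that space a fixed evaluation budget could cover at best. The function names and the example peptide are hypothetical, chosen only for illustration.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (the 20 values sum to the 61 sense codons).
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def search_space_size(protein: str) -> int:
    """Count the distinct DNA sequences that encode the given peptide."""
    return prod(DEGENERACY[aa] for aa in protein.upper())


def sampled_fraction(protein: str, evaluations: int) -> float:
    """Upper bound on the fraction of encodings covered by a fixed budget."""
    return min(1.0, evaluations / search_space_size(protein))


# Hypothetical peptide: each repeat multiplies the space by 1 * 2 * 6 = 12.
for repeats in (1, 2, 10):
    peptide = "MKL" * repeats
    print(len(peptide), search_space_size(peptide),
          sampled_fraction(peptide, 1_000_000))
```

Each additional residue multiplies the space by its degeneracy, so with the budget held fixed the covered fraction shrinks geometrically with chain length.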
“…Multiple factors are known to influence the outcome of recombinant protein production. These include codon usage of the gene (Fu et al, 2020), expression vector and plasmid design (Rosano and Germán, 2019), host strain design and optimizations, growth media and cultivation conditions, as well as protein recovery method (Zhang et al, 2020). In addition, some proteins can be toxic to the host or aggregate in inclusion bodies (Rosano and Germán, 2019).…”
Section: Introduction (mentioning)
confidence: 99%
“…However, due to the variation in natural proteins, this is not always possible. To handle the variations, multiple growth media and cultivation conditions can be explored, as can optimizations of the genes codon usage to better match the codon usage of the recombinant host (Fu et al, 2020). The above factors and variability in the expression system are expected to have significant impact on the protein expression outcome, and strategies for selecting genes more like to express are needed.…”
Section: Introduction (mentioning)
confidence: 99%