2017
DOI: 10.1038/s41598-017-10546-0

Predicting synonymous codon usage and optimizing the heterologous gene for expression in E. coli

Abstract: Of the 20 common amino acids, 18 are encoded by multiple synonymous codons. These synonymous codons are not redundant; in fact, all codons contribute substantially to protein expression, structure and function. In this study, the codon usage pattern of E. coli genes was learned from sequenced E. coli genomes. A machine-learning-based method, Presyncodon, was proposed to predict synonymous codon selection in E. coli based on the learned codon usage patterns of the residue in the context of the s…

Cited by 36 publications (20 citation statements)
References 54 publications
“…The training gene sequences were translated, and were also split into window sizes of five or seven amino acids, and searched against the corresponding CSI files. For each fragment, the matched score ( s ), expected maximal score ( m ) of the target fragment, and the matched percent ( p , p = s / m ) against the CSI file were calculated by the method described in [20]. For a given cut-off level ( c ), if the calculated matched percent of multiple fragments from the CSI file for a fragment was greater than the cut-off level, the coding vector for the middle codon in the fragment was the arithmetic average of those vectors encoding the selected multiple fragments.…”
Section: Methods
Citation type: mentioning
confidence: 99%
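
The fragment-matching step quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: the scoring method of [20] is not given in the excerpt, so `matched_percent` uses a placeholder score (the count of identical positions), and the CSI file is modeled as a hypothetical in-memory mapping from fragment to coding vector. The function names are invented for this sketch.

```python
from typing import Dict, List, Optional


def windows(protein: str, size: int) -> List[str]:
    """Return every contiguous fragment of `size` residues (e.g. 5 or 7)."""
    return [protein[i:i + size] for i in range(len(protein) - size + 1)]


def matched_percent(fragment: str, candidate: str) -> float:
    """Compute p = s / m.

    ASSUMPTION: s is the number of identical positions and m is the
    fragment length. The real scoring of [20] is not shown in the excerpt.
    """
    s = sum(a == b for a, b in zip(fragment, candidate))
    m = len(fragment)
    return s / m


def middle_codon_vector(
    fragment: str,
    csi: Dict[str, List[float]],  # hypothetical stand-in for a CSI file
    cutoff: float,
) -> Optional[List[float]]:
    """Average the vectors of all CSI fragments whose p exceeds the cut-off c."""
    selected = [
        vector
        for candidate, vector in csi.items()
        if len(candidate) == len(fragment)
        and matched_percent(fragment, candidate) > cutoff
    ]
    if not selected:
        return None  # no CSI fragment passed the cut-off
    count = len(selected)
    # Arithmetic average, taken component by component.
    return [sum(column) / count for column in zip(*selected)]
```

With a toy CSI mapping such as `{"MKTAY": [1.0, 0.0], "MKTAV": [0.0, 1.0]}` and a cut-off of 0.7, both entries match the fragment `MKTAY` and the averaged vector is `[0.5, 0.5]`.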
“…Machine-learning models were constructed by the random forest classification to predict a selection of synonymous codons (low- or high-frequency-usage codon) for the target gene. Compared with the early version of Presyncodon, which could only design the gene to be efficiently expressed in E. coli in local [20], this new version could design new genes from protein sequences for optimal expression in three recombinant hosts ( E. coli , B. subtilis , and S. cerevisiae ) on the web; and the training dataset has been updated with more genomes. Therefore, this method will be easily and efficiently used to design genes for heterologous gene expression in the three popular expression hosts ( E. coli , B. subtilis , and S. cerevisiae ).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…VP1, VP2 and VP3 sequences of the DWV LN8/17 strain were analyzed and optimized based on E. coli-preferred codons without changing the amino acid sequence of the corresponding proteins (Gao et al, 2015; Mansouri et al, 2013; Wang et al, 2012). High-frequency-usage codons in E. coli were the most commonly used for each of the individual amino acids (Tian et al, 2017). VP1, VP2 and VP3 were subsequently reverse-translated by applying the single-most commonly used codon for each amino acid such that the final codon-optimized genes were represented by the most likely nondegenerate coding sequence and online optimization software (http://www.jcat.de/ and http://genomes.urv.es/OPTIMIZER/) were utilized for codon design.…”
Section: Codon Optimization and Construction Of Recombinant Expressio…
Citation type: mentioning
confidence: 99%
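
The "single most commonly used codon" reverse translation described in the statement above can be sketched as below. This is a minimal illustration, not the procedure used by JCat or OPTIMIZER: the helper names are hypothetical, and the preferred-codon table is derived from whatever reference coding sequences the caller supplies rather than from a published E. coli usage table. Only the standard genetic code is assumed.

```python
from collections import Counter, defaultdict
from itertools import product
from typing import Dict, Iterable

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def preferred_codons(reference_cds: Iterable[str]) -> Dict[str, str]:
    """Pick, for each amino acid, the codon seen most often in the reference CDSs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sequence in reference_cds:
        usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = sequence[i:i + 3].upper()
            aa = CODON_TO_AA.get(codon)  # None for codons with ambiguous bases
            if aa is not None and aa != "*":
                counts[aa][codon] += 1
    return {aa: usage.most_common(1)[0][0] for aa, usage in counts.items()}


def reverse_translate(protein: str, table: Dict[str, str]) -> str:
    """Encode a protein using one fixed codon per amino acid."""
    return "".join(table[aa] for aa in protein)


# Toy reference sequence, made up for illustration only.
toy_table = preferred_codons(["ATGCTGCTGCTTAAATAA"])
toy_gene = reverse_translate("MLK", toy_table)
```

Because every amino acid always maps to the same codon, the result is a single nondegenerate coding sequence; a residue absent from the reference set raises `KeyError`, which makes gaps in the usage table visible instead of silently guessing.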
“…Some of these approaches provide a tool without experimentally testing its efficacy (Rodriguez et al, 2018). In other cases, investigators do experimentally test their predictions, sometimes verifying the tool (Tian et al, 2017) and sometimes finding that the tool has more limited efficacy (Mignon et al, 2018). There are also experimental methods leveraging directed evolution to improve protein folding in vivo, as reviewed recently (Sachsenhauser and Bardwell, 2018).…”
Section: Difficulties and Recent Developments In Using Heterologous H…
Citation type: mentioning
confidence: 99%