2023
DOI: 10.3389/fmicb.2023.1215609

EVMP: enhancing machine learning models for synthetic promoter strength prediction by Extended Vision Mutant Priority framework

Abstract: Introduction: In metabolic engineering and synthetic biology applications, promoters with appropriate strengths are critical. However, it is time-consuming and laborious to annotate promoter strength by experiments. Nowadays, constructing mutation-based synthetic promoter libraries that span multiple orders of magnitude of promoter strength is receiving increasing attention. A number of machine learning (ML) methods are applied to synthetic promoter strength prediction, but existing models are limited by the exc…

Cited by 5 publications (6 citation statements)
References 27 publications
“…For example, when applied to predict the strength of trc promoters, our model achieved an R-squared (R²) value of 0.77, signifying a substantial improvement over Zhao et al.'s method (0.53) [29] and six EVMP-based algorithms (0.65 for the best one) [30]. This underscores the considerable superiority of our model structure.…”
Section: Introduction (mentioning)
confidence: 62%
“…Their method scored an R² of 0.53. We also evaluated six EVMP-based algorithms developed in [30] and obtained R² values ranging from 0.62 to 0.65. This demonstrates that our model achieved an overall improvement of 18% compared to the previously best- showcasing that our model could utilize general information from Task1 to make more specific predictions such as Task2.…”
Section: Performance On Two Prediction Tasks (mentioning)
confidence: 99%
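
The 18% figure in the statement above is consistent with a relative improvement of the reported R² of 0.77 over the best EVMP-based score of 0.65. The sketch below only checks that arithmetic; the function name `relative_improvement` is hypothetical, and reading the 18% as relative (rather than absolute) is an assumption, since the quoted sentence is cut off.

```python
def relative_improvement(new: float, old: float) -> float:
    """Fractional gain of `new` over `old` (assumed interpretation of the 18%)."""
    return (new - old) / old


# R² values quoted in the citation statements above.
r2_citing_model = 0.77
r2_best_evmp = 0.65
r2_zhao = 0.53

# Gain over the best EVMP-based algorithm, as a percentage.
print(f"vs best EVMP: {relative_improvement(r2_citing_model, r2_best_evmp):.1%}")
# Gain over Zhao et al.'s method, for comparison.
print(f"vs Zhao et al.: {relative_improvement(r2_citing_model, r2_zhao):.1%}")
```

The absolute difference between 0.77 and 0.65 is 0.12 R² points, so the two readings should not be confused when comparing models.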
“…The further growth of libraries above 10⁸ elements requires the application of synthetic DNA fragments not related to known DNA genome sequences [46]. However, the analysis of such large libraries usually employs ML or DL to identify relations between DNA sequence properties and promoter activity [46,50,[241][242][243]251,262,263]. To improve these ML and DL approaches, they are trained on synthetic, random DNA fragments to test a larger sequence space; models trained on such synthetic data can predict genomic activity better than those solely trained on genome DNA [46,251].…”
Section: Discussion (mentioning)
confidence: 99%
“…The EVMP uses better mutation information through the equivalent transformation of synthetic promoters into base promoters which are input into BaseEncoder and corresponding k-mer mutations, which are entered into VarEncoder. Therefore, the EVMP can significantly improve the prediction accuracy of promoter strength [243]. Advanced ML algorithms produce highly accurate models of gene expression, uncovering novel regulatory features in nucleotide sequences involving multiple cis-regulatory regions across whole genes and DNA structural properties.…”
Section: Machine Learning and Deep Learning Support To Synthetic Prom... (mentioning)
confidence: 99%
“…The EVMP uses better mutation information through the equivalent transformation of synthetic promoters into base promoters and corresponding k-mer mutations, which are input into BaseEncoder and VarEncoder, respectively. Therefore, the EVMP can significantly improve the prediction accuracy of promoter strength [243].…”
Section: Machine Learning and Deep Learning Support To Synthetic Prom... (mentioning)
confidence: 99%
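
The decomposition described in the statements above, where a synthetic promoter is represented as a base promoter plus its k-mer mutations, can be illustrated with a minimal sketch. This is not the EVMP implementation: the function name `kmer_mutations`, the window centring, the default `k`, and the restriction to substitutions of equal-length sequences are all assumptions made for illustration, and the example sequences are made up.

```python
from typing import List, Tuple


def kmer_mutations(base: str, synthetic: str, k: int = 5) -> List[Tuple[int, str]]:
    """Return (position, k-mer) pairs for every substituted site.

    Hypothetical helper: each k-mer is the window of the synthetic
    promoter centred on a position where it differs from the base
    promoter, clipped at the sequence ends.
    """
    if len(base) != len(synthetic):
        # Assumption: insertions and deletions are out of scope here.
        raise ValueError("this sketch handles substitutions only")
    half = k // 2
    mutations = []
    for i, (b, s) in enumerate(zip(base, synthetic)):
        if b != s:
            start = max(0, i - half)
            end = min(len(synthetic), i + half + 1)
            mutations.append((i, synthetic[start:end]))
    return mutations


# Made-up sequences with one substitution (A -> G at index 10).
base_promoter = "TTGACAATTAATCATCCGGC"
synthetic_promoter = "TTGACAATTAGTCATCCGGC"

# The base promoter would go to one encoder and the mutation list to
# the other (BaseEncoder and VarEncoder in the statements above).
print(kmer_mutations(base_promoter, synthetic_promoter, k=5))
```

Under these assumptions, a library of variants of one base promoter shares a single base input, and each variant is distinguished only by its short list of mutation windows.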