2023
DOI: 10.1021/acssynbio.3c00261

ProtWave-VAE: Integrating Autoregressive Sampling with Latent-Based Inference for Data-Driven Protein Design

Nikša Praljak,
Xinran Lian,
Rama Ranganathan
et al.

Cited by 5 publications (4 citation statements)
References 57 publications
“…The performance of our zero-shot library contributes to a growing body of evidence showing that samples from models fit on natural sequences can be used to generate libraries that are not only enriched for functional variants [82,97,98] but also contain variants with improved fitness [42,59,99–105], even though the zero-shot sampling process did not take the target phenotype of the engineering campaign into account. We encourage further exploration of these techniques for initial library design [63,106], particularly in lower-throughput settings where improving hit rates can increase the chance of finding at least one satisfactory variant (Figure 5b).…”
Section: Discussion (mentioning)
confidence: 99%
“…Importantly, our ML campaign outperformed two directed evolution approaches that used the same platform: one that was run independently and using standard in-vitro techniques for hit selection and diversification and one that was designed in-silico and pooled with the ML-designed libraries for screening. Finally, the performance of our zero-shot library contributes to a growing body of evidence showing that samples from models fit on natural sequences can be used to generate libraries that are not only enriched for functional variants [76–78] but also contain variants with improved fitness [41,59,79–85].…”
Section: Discussion (mentioning)
confidence: 99%
“…In pure sequence generation, protein language models (PLMs) can be conditioned by a known enzyme family to generate novel sequences with that function, without direct consideration of structure (Figure D). Models with transformer architectures have generated enzymes such as lysozymes, malate dehydrogenases, and chorismate mutases: for the best models, up to 80% of wet-lab validated sequences expressed and functioned. Some of these generated sequences have low sequence identity (<40%) to known proteins and may be quite different from those explored by evolution, thus potentially unlocking combinations of properties not found in nature. Variational autoencoders (VAEs) have been used to generate phenylalanine hydroxylases and luciferases, with wet-lab validation achieving 30–80% success rates. Generative adversarial networks (GANs) were also applied to the generation of malate dehydrogenases, with 24% success rate. Alternatively, a diffusion model such as EvoDiff could achieve better coverage of protein functional and structural space during generation.…”
Section: Discovery Of Functional Enzymes With Machine Learning (mentioning)
confidence: 99%
“…Thanks to the use of deep learning, protein engineering has seen tremendous advances, allowing researchers to design reliable sequence and structure prediction techniques. State-of-the-art fast structure prediction techniques such as AlphaFold2 have played a crucial role in this progress, granting greater access to protein structures and accelerating research involving the design of new proteins and peptides [7–12].…”
Section: Introduction (mentioning)
confidence: 99%