2024
DOI: 10.1021/jacs.3c10941

Improving Protein Expression, Stability, and Function with ProteinMPNN

Kiera H. Sumida,
Reyes Núñez-Franco,
Indrek Kalvet
et al.

Abstract: Natural proteins are highly optimized for function but are often difficult to produce at a scale suitable for biotechnological applications due to poor expression in heterologous systems, limited solubility, and sensitivity to temperature. Thus, a general method that improves the physical properties of native proteins while maintaining function could have wide utility for protein-based technologies. Here, we show that the deep neural network ProteinMPNN, together with evolutionary and structural information, p…

citations
Cited by 72 publications
(27 citation statements)
references
References 39 publications
“…There, a more conservative cutoff of 7 Å for fixing the ligand-proximal amino acids during reengineering was chosen, but given the larger size of myoglobin, resulting designs had 41%-55% sequence identity with the most similar protein in the UniRef100 database (Sumida et al, 2024). Several examples where protein- or peptide-binding proteins (TEV protease, ubiquitin, ghrelin receptor) were reengineered using ProteinMPNN similarly display high success rates (de Haas et al, 2023; Goverde et al, 2023; Sumida et al, 2024). Finally, new methods called LigandMPNN and CARBonAra were recently described that explicitly model nonprotein components, but their codes are not yet readily available (Dauparas et al, 2023; Krapp et al, 2023; Krishna et al, 2023).…”
Section: Discussion (mentioning)
confidence: 99%
“…In a paper submitted after this one, Sumida and colleagues demonstrate that ProteinMPNN can be used to reengineer another ligand-binding colored protein, human myoglobin, with a comparable success rate (Sumida et al, 2024). There, a more conservative cutoff of 7 Å for fixing the ligand-proximal amino acids during reengineering was chosen, but given the larger size of myoglobin, resulting designs had 41%-55% sequence identity with the most similar protein in the UniRef100 database (Sumida et al, 2024). Several examples where protein- or peptide-binding proteins (TEV protease, ubiquitin, ghrelin receptor) were reengineered using ProteinMPNN similarly display high success rates (de Haas et al, 2023; Goverde et al, 2023; Sumida et al, 2024).…”
Section: Discussion (mentioning)
confidence: 99%
“…The performance of our zero-shot library contributes to a growing body of evidence showing that samples from models fit on natural sequences can be used to generate libraries that are not only enriched for functional variants [82, 97, 98] but also contain variants with improved fitness [42, 59, 99-105], even though the zero-shot sampling process did not take the target phenotype of the engineering campaign into account. We encourage further exploration of these techniques for initial library design [63, 106], particularly in lower-throughput settings where improving hit rates can increase the chance of finding at least one satisfactory variant (Figure 5b).…”
Section: Discussion (mentioning)
confidence: 99%
“…Importantly, our ML campaign outperformed two directed evolution approaches that used the same platform: one that was run independently and using standard in-vitro techniques for hit selection and diversification and one that was designed in-silico and pooled with the ML-designed libraries for screening. Finally, the performance of our zero-shot library contributes to a growing body of evidence showing that samples from models fit on natural sequences can be used to generate libraries that are not only enriched for functional variants [76-78] but also contain variants with improved fitness [41, 59, 79-85].…”
Section: Discussion (mentioning)
confidence: 99%
“…It consists of 3 encoder layers that encode the backbone coordinates of the input protein, followed by 3 decoder layers that predict a sequence in seconds and in an autoregressive manner [14]. It is a powerful tool that has been successfully applied to many protein design problems, including the de novo design of new folds [16], protein binders [17] and enzymes [18], and the redesign of native proteins [19]. However, DL design tools have met limited success for protein folds mainly composed of antiparallel β-sheets [20].…”
Section: Introduction (mentioning)
confidence: 99%