2023
DOI: 10.1038/s42003-023-05464-z

SaLT&PepPr is an interface-predicting language model for designing peptide-guided protein degraders

Garyk Brixi, Tianzheng Ye, Lauren Hong, et al.

Abstract: Protein-protein interactions (PPIs) are critical for biological processes and predicting the sites of these interactions is useful for both computational and experimental applications. We present a Structure-agnostic Language Transformer and Peptide Prioritization (SaLT&PepPr) pipeline to predict interaction interfaces from a protein sequence alone for the subsequent generation of peptidic binding motifs. Our model fine-tunes the ESM-2 protein language model (pLM) with a per-position prediction task to ide…

Cited by 17 publications (11 citation statements)
References: 30 publications

“…Specifically, we mask 15% of the full sequence, as this percentage has performed well in prior studies [Devlin et al, 2018]. Since fusion oncoproteins represent the interaction of two distinct proteins, we masked amino acids that are likely to participate in PPIs as determined by the output probabilities of SaLT&PepPr [Brixi et al, 2023], which predicts a per-amino acid probability of binding. Our masking strategy is as follows:…”
Section: Methods
mentioning
confidence: 99%
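
The excerpt above cuts off before the citing authors state their actual masking strategy, so the following is only a rough sketch of the general idea it describes: using per-residue binding probabilities to decide which 15% of positions to mask. The top-ranked selection rule, the function names `select_mask_positions` and `apply_mask`, the `<mask>` token string, and the example probabilities are all assumptions made for illustration; none of them come from the cited work or from SaLT&PepPr itself.

```python
def select_mask_positions(binding_probs, mask_percent=15):
    """Pick the positions with the highest binding probability.

    Assumption: residues are ranked by probability and the top
    `mask_percent` percent are chosen. The cited paper's real rule
    is not shown in the excerpt.
    """
    # Integer arithmetic avoids floating-point surprises in the count.
    n_mask = (len(binding_probs) * mask_percent) // 100
    ranked = sorted(
        range(len(binding_probs)),
        key=lambda i: binding_probs[i],
        reverse=True,
    )
    return sorted(ranked[:n_mask])


def apply_mask(sequence, positions, mask_token="<mask>"):
    """Replace the residues at `positions` with a mask token."""
    chosen = set(positions)
    return [mask_token if i in chosen else aa for i, aa in enumerate(sequence)]


# Hypothetical 20-residue sequence with made-up per-residue probabilities.
sequence = "ACDEFGHIKLMNPQRSTVWY"
probs = [0.05] * len(sequence)
probs[2], probs[7], probs[11] = 0.90, 0.80, 0.95

positions = select_mask_positions(probs)
masked = apply_mask(sequence, positions)
print(positions)
print("".join(masked))
```

In a real pipeline the probabilities would come from an interface predictor rather than being written by hand, and the choice between deterministic top-ranked selection and probability-weighted sampling is a design decision this sketch does not settle.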
“…Meanwhile, protein language models (pLMs), such as ESM-2 and ProtT5, have been trained on the amino acid sequences of over 250 million proteins, from the exceedingly stable to the intrinsically disordered [Lin et al, 2023, Elnaggar et al, 2022]. They capture physicochemical, structural, and functional properties of proteins from their sequence alone, and have even been extended to design novel proteins [Ferruz et al, 2022, Madani et al, 2023] and binders [Brixi et al, 2023, Bhat et al, 2023, Chen et al, 2023]. However, these models were not trained on fusion oncoprotein sequences, which are functionally and structurally distinct from their wild-type counterparts due to their altered binding sites and unique breakpoint junctions.…”
Section: Introduction
mentioning
confidence: 99%
“…In particular, the key for machine learning-based protein structure prediction is the extraction of essential structural features from protein data, which serve as the basis for predicting the three-dimensional arrangement of atoms in a protein molecule [44][45][46][47][48][49][50][51][52][53]. Take AlphaFold for example, which is an artificial intelligence system developed by DeepMind, a subsidiary of Alphabet Inc. (Google's parent company).…”
Section: Introduction
mentioning
confidence: 99%
“…As an alternative to small molecule-based approaches, peptide-based ligands possess large protein-protein interaction surfaces, making them suitable for targeting any POI. Coupled with the rapid development of structural biology techniques that provide detailed protein-protein structural information, 16,17 mature directed-evolution technologies such as phage and yeast display, 18-20 and emerging computational approaches for rapid discovery of synthetic binding peptides, 6,21-23 peptide-based ligands are ideal for extending the scope of PROTACs to “undruggable” proteins. 24,25 Several Peptide-based Proteolysis Targeting Chimeras (PepTACs) targeting oncoproteins and transcription factors make use of cationic cell-penetrating peptides, 26,27 cyclic peptides, 28,29 or peptide stapling 30-33 to facilitate cellular uptake.…”
mentioning
confidence: 99%