2023
DOI: 10.1101/2023.01.29.525793
Preprint

On Pre-trained Language Models for Antibody

Abstract: Antibodies are vital proteins offering robust protection for the human body from pathogens. The development of both general protein and antibody-specific pre-trained language models facilitates antibody prediction tasks. However, few studies have comprehensively explored the representation capability of distinct pre-trained language models on different antibody problems. Here, to investigate the problem, we aim to answer the following key questions: (1) How do pre-trained language models perform in antibody tasks wit…

Cited by 19 publications (23 citation statements)
References 50 publications
“…It is expected that using machine learning to decode information in adaptive immune receptor repertoires can transform the prediction of immune responses and accelerate the development of vaccines, therapeutics, and diagnostics [36]. Having demonstrated that both Ablang and ESM2 capture important evolutionary information for both BCRs and TCRs, we shifted our focus to examining the predictive accuracy achieved by PLMs in epitope specificity predictions for TCRs and BCRs, both significant open challenges in immunoinformatics [37,24]. One of the primary limitations of current models for specificity prediction tasks is the scarcity of labeled training data.…”
Section: Comparing Generalist and Domain-specific Embedders for BCR A…
confidence: 99%
“…There are a few studies that perform pretraining on protein sequences (Bepler and Berger 2019; Wang et al. 2023; Zhou et al. 2023). Bepler and Berger (2019) propose to train an LSTM on protein sequences, which can implicitly incorporate structural information from the global structural similarity between proteins and the contact maps for individual proteins, whereas STEPS uses novel self-supervised tasks to explicitly model protein structure.…”
Section: Related Work
confidence: 99%
“…Task-adaptive DR Model Pretraining. To pretrain a DR model g(·; φ) in an unsupervised fashion, we continuously pretrain the language model on the corpus X, using the contrastive learning widely adopted in recent research [11, 13, 30, 36]. Specifically, for each document d_i ∈ X, we first split each document into multiple sentences and randomly sample two sentences d_{i,1}, d_{i,2} as the positive pair.…”
Section: Stage-I: Dense Retrieval with Label Names
confidence: 99%
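
As a rough illustration of the positive-pair construction quoted above, the following minimal Python sketch splits each document into sentences and samples two of them as a positive pair. The names (sentence_split, sample_positive_pair), the regex splitter, and the toy corpus are assumptions for illustration only, not the cited paper's implementation; the encoder g(·; φ) and the contrastive loss itself are omitted.

import random
import re

def sentence_split(document: str) -> list[str]:
    # Naive splitter on sentence-ending punctuation; a real pipeline
    # would use a proper sentence tokenizer (assumption for illustration).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

def sample_positive_pair(document: str, rng: random.Random):
    # Draw two distinct sentences from the same document as a positive pair;
    # in contrastive pretraining, other in-batch sentences serve as negatives.
    sentences = sentence_split(document)
    if len(sentences) < 2:
        return None  # a single-sentence document cannot form a pair
    s1, s2 = rng.sample(sentences, 2)
    return s1, s2

# Usage: build one positive pair per document in a toy corpus X.
rng = random.Random(0)
corpus = [
    "Antibodies bind antigens. They are produced by B cells. Affinity varies.",
]
pairs = [p for d in corpus if (p := sample_positive_pair(d, rng)) is not None]
print(pairs)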