2024
DOI: 10.1101/2024.08.24.609531
Preprint

Benchmarking text-integrated protein language model embeddings and embedding fusion on diverse downstream tasks

Young Su Ko, Jonathan Parkinson, Wei Wang

Abstract: Protein language models (pLMs) have traditionally been trained in an unsupervised manner on large protein sequence databases, using an autoregressive or masked-language modeling paradigm. Recent methods have attempted to enhance pLMs by integrating additional information in the form of text; these models are referred to as "text+protein" language models (tpLMs). We evaluate and compare six tpLMs (OntoProtein, ProteinDT, ProtST, ProteinCLIP, ProTrek, and ESM3) against ESM2, a baseline text-free pLM, across …
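
The visible abstract mentions embedding fusion but is truncated before describing the mechanics. As a minimal, hypothetical sketch of one common fusion strategy, concatenating precomputed per-protein embeddings from two models before a lightweight downstream classifier; all names, dimensions, and data below are illustrative stand-ins, not the paper's actual pipeline:

```python
# Hypothetical embedding-fusion sketch: concatenate per-protein embeddings
# from two models, then fit a simple downstream classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_proteins = 500

# Stand-ins for precomputed, mean-pooled per-protein embeddings
# (e.g., 1280-d from an ESM2 checkpoint, 768-d from a tpLM);
# random arrays here purely for illustration.
esm2_emb = rng.normal(size=(n_proteins, 1280))
tplm_emb = rng.normal(size=(n_proteins, 768))
labels = rng.integers(0, 2, size=n_proteins)  # toy binary downstream task

# Embedding fusion by concatenation along the feature axis.
fused = np.concatenate([esm2_emb, tplm_emb], axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    fused, labels, test_size=0.2, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

Concatenation keeps each model's representation intact and lets the downstream head learn how to weight the two feature sets; alternatives such as averaging after a learned projection trade dimensionality for that flexibility.
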

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

Cited by 0 publications
References 34 publications