2022
DOI: 10.48550/arxiv.2205.05789
Preprint

RITA: a Study on Scaling Up Generative Protein Sequence Models

Abstract: In this work we introduce RITA: a suite of autoregressive generative models for protein sequences, with up to 1.2 billion parameters, trained on over 280 million protein sequences belonging to the UniRef-100 database. Such generative models hold the promise of greatly accelerating protein design. We conduct the first systematic study of how capabilities evolve with model size for autoregressive transformers in the protein domain: we evaluate RITA models in next amino acid prediction, zero-shot fitness, and enz…

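The abstract mentions zero-shot fitness as one of the evaluations. A common way to read that is to use the sequence log-likelihood under the autoregressive model as a proxy for mutational fitness, with no task-specific fine-tuning. The sketch below illustrates that idea through the Hugging Face transformers interface; the checkpoint id `lightonai/RITA_s` and the toy sequences are assumptions for illustration, and any causal protein language model exposing the standard `AutoModelForCausalLM` interface could be swapped in.

```python
# Minimal sketch of zero-shot fitness scoring with an autoregressive protein LM.
# Assumptions: the checkpoint id and the toy sequences below are illustrative only;
# any causal LM exposing the standard transformers interface works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "lightonai/RITA_s"  # assumed small RITA checkpoint on the HF Hub
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# trust_remote_code is needed for checkpoints that ship custom modeling code.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
model.eval()

@torch.no_grad()
def log_likelihood(sequence: str) -> float:
    """Sum of per-residue log-probabilities of `sequence` under the model."""
    ids = tokenizer(sequence, return_tensors="pt").input_ids      # (1, L)
    logits = model(ids).logits                                    # (1, L, vocab)
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    # Log-probability assigned to each residue, given the prefix before it.
    token_lp = log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_lp.sum().item()

# Zero-shot fitness proxy: log-likelihood difference between mutant and wild type.
wild_type = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"    # toy sequence, not from the paper
mutant    = wild_type[:24] + "A" + wild_type[25:]  # single substitution (E25A)
print("delta log-likelihood (mutant - wt):",
      log_likelihood(mutant) - log_likelihood(wild_type))
```

A higher score for the mutant than for the wild type means the model finds the variant more "natural"; ranking variants by this difference is the kind of zero-shot fitness signal the abstract refers to, used here as a sketch rather than a reproduction of the paper's evaluation protocol.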
Cited by 28 publications (35 citation statements)
References 22 publications (32 reference statements)

Citation statements, ordered by relevance:
“…The latter work allows for sequence generation conditioned on tags corresponding to molecular function or taxonomic information. Similar to results in NLP, scaling protein language models to very large sizes seems promising for protein sequences [15].…”
Section: Related Literature
Mentioning confidence: 76%
“…We explore different architectural choices and regularization schemes, and compare our results with a recently published shallow auto-regressive method [32], which we use as a baseline. We also compare on a smaller scale to fine-tuned large protein language models, using RITA [15], and explore how structural predictions from AlphaFold 2 correlate with our results.…”
Section: Introduction
Mentioning confidence: 99%
“…These are widely used to perform peptide generation with desired properties [38-42]. Linder et al. [43] developed a generative method which can explicitly control sequence diversity during training by penalizing any two generated sequences on the basis of similarity. Ferruz et al. [44] developed ProtGPT2, a Transformer-based generative model trained on UniRef50 which can generate de novo protein sequences following the principles of natural ones.…”
Section: A. Related Work
Mentioning confidence: 99%
“…These are widely used to perform peptide generation with desired properties [38-42]. Linder et al. [43] developed a generative method which can explicitly control sequence diversity during training by penalizing any two generated sequences on the basis of similarity.…”
Section: Introduction
Mentioning confidence: 99%
“…Following the principles of DARK3, ProtGPT2 leveraged a GPT2-like model [85] and was trained on the UniRef50 dataset [83], leading to a model able to generate proteins in unexplored regions of the natural protein space while presenting natural-like properties [16]. RITA [86] included a study on the scalability of generative Transformer models with several model-specific (e.g. perplexity) and application-specific (e.g.…”
Section: Introduction
Mentioning confidence: 99%