2022
DOI: 10.48550/arxiv.2212.09741
Preprint

One Embedder, Any Task: Instruction-Finetuned Text Embeddings

Cited by 14 publications (13 citation statements)
References 0 publications

“…Contrastive learning has also been adopted as a pre-training objective for sentence representation learning (Wang et al. 2022b; Su et al. 2022). Compared to contrastive learning, generative learning approaches are less investigated in the field of self-supervised sentence representation learning. Generative sentence representation learning attempts to generate original sentences from their corrupted or masked versions (Yang et al. 2020; Wang, Reimers, and Gurevych 2021).…”
Section: Self-supervised Learning
Citation type: mentioning, confidence: 99%
“…Our Mistral 7B and LLaMA 2 models use InstructorEmbedding [43] for embedding text. The instruction we give is "Represent the scientific text:", as we are working only with scientific texts or documents.…”
Section: Recommendation Module
Citation type: mentioning, confidence: 99%
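The sketch below shows how such a call to the InstructorEmbedding package is typically written. It is an illustrative snippet, not code from the cited paper: the checkpoint name (hkunlp/instructor-large) and the sample documents are assumptions, while the instruction string is the one quoted in the statement above.

```python
# pip install InstructorEmbedding sentence_transformers
from InstructorEmbedding import INSTRUCTOR

# Load a pretrained INSTRUCTOR checkpoint; instructor-large is an assumed choice,
# the citing paper does not say which model size it used.
model = INSTRUCTOR("hkunlp/instructor-large")

# Hypothetical scientific documents to embed.
documents = [
    "Graph neural networks for molecular property prediction.",
    "Instruction-finetuned embedders generalize to unseen tasks.",
]

# INSTRUCTOR encodes [instruction, text] pairs; the instruction below is the one
# quoted in the citing paper.
instruction = "Represent the scientific text:"
embeddings = model.encode([[instruction, doc] for doc in documents])

print(embeddings.shape)  # one embedding vector per document
```

The same model and instruction are reused for every document, so the resulting vectors live in a single task-conditioned embedding space and can be compared directly, for example in the recommendation module the statement describes.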
“…Using the transformer-based technique, specifically the INSTRUCTOR (Su et al. 2022) model, we generate embeddings of 768 dimensions with a 32-bit float data type. This state-of-the-art model is optimized for diverse downstream tasks, such as classification, clustering, and semantic similarity.…”
Section: Data Embeddings
Citation type: mentioning, confidence: 99%
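A minimal sketch of that embedding step is given below, again as an assumed workflow rather than the cited paper's code: the checkpoint name and input texts are illustrative, and cosine similarity stands in for the semantic-similarity use case mentioned in the statement. The hkunlp/instructor-large checkpoint produces 768-dimensional float32 vectors, matching the description above.

```python
import numpy as np
from InstructorEmbedding import INSTRUCTOR

# Checkpoint choice is an assumption for illustration.
model = INSTRUCTOR("hkunlp/instructor-large")

texts = [
    "Contrastive pre-training for sentence representations.",
    "Self-supervised objectives for learning sentence embeddings.",
]
# Illustrative instruction; INSTRUCTOR accepts a free-form task description.
pairs = [["Represent the scientific text:", t] for t in texts]

emb = model.encode(pairs)
print(emb.shape, emb.dtype)  # (2, 768), float32

# Cosine similarity between the two embeddings, one of the downstream
# uses (semantic similarity) mentioned in the citation statement.
a, b = emb
cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {cosine:.3f}")
```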