2018
DOI: 10.48550/arxiv.1802.05365
Preprint

Deep contextualized word representations

Cited by 460 publications (566 citation statements)
References 0 publications
“…It originates from pre-training contextual representations, e.g. ELMo [27], ULM-FiT [14], OpenAI [28], etc. BERT converts an input sequence (x_1, ..., x_n) to a sequence of vector representations z = (z_1, ..., z_n) [35].…”
Section: BERT (mentioning)
confidence: 99%
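The quoted statement describes BERT as a map from an input token sequence (x_1, ..., x_n) to contextual vectors z = (z_1, ..., z_n). A minimal sketch of that mapping follows; it is not taken from the cited papers and assumes the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint purely for illustration.

```python
# Hedged sketch: obtain one contextual vector z_i per input token x_i.
# Assumes Hugging Face `transformers` and `bert-base-uncased`; neither is
# prescribed by the quoted papers.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "Deep contextualized word representations improve many NLP tasks."
inputs = tokenizer(sentence, return_tensors="pt")  # token ids for x_1..x_n

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (1, n, hidden_size): the sequence z_1..z_n.
z = outputs.last_hidden_state
print(z.shape)
```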
“…GloVe presents a regression-based model to predict the conditional probability of a word appearing given another word. Context-aware word embeddings, such as Embeddings from Language Model (ELMo) [8] and Bidirectional Encoder Representations from Transformers (BERT) [43], were more recently proposed to generate word representations that better consider the context of the sentence. However, all these embeddings are usually trained on common text corpora [7].…”
Section: Textual Word Embeddings (mentioning)
confidence: 99%
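To make the contrast with static embeddings concrete, the sketch below checks that a context-aware model assigns different vectors to the same surface word in different sentences. It again assumes the `transformers` library and `bert-base-uncased` as stand-ins; an ELMo implementation would behave analogously.

```python
# Hedged illustration: the same word ("bank") gets different contextual
# vectors in different sentences, unlike a static GloVe-style embedding.
# Assumes Hugging Face `transformers` and `bert-base-uncased` for illustration.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Contextual vector of the first occurrence of `word` in `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

v_river = word_vector("he sat on the bank of the river", "bank")
v_money = word_vector("she deposited the cash at the bank", "bank")

# Cosine similarity is well below 1.0: the representation depends on context.
print(torch.cosine_similarity(v_river, v_money, dim=0).item())
```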
“…These classical approaches are linear language modeling approaches and often fail to model the true contextual meaning of text corpora. In contrast, Word2Vec [6], GloVe [7], and ELMo [8] are some of the more modern techniques for contextualizing the meaning of text corpora, which incorporate neural networks for non-linear language modelling. However, these models are often trained on datasets derived from Twitter, Wikipedia, or general pieces of text and are therefore not entirely suitable for the analysis of scientific publications due to the existence of domain-specific words in these corpora.…”
Section: Introduction (mentioning)
confidence: 99%
“…Therefore, the same word may have different embedding representations depending on the context in which it appears in the request to the chatbot. Recent developments in context-dependent embeddings [10,25] show that systems based on such representations achieve good results in different text classification tasks [26], including fake news detection [27], detection of user satisfaction in chatbot systems and call centers [28,29], document classification [30], and health-care applications [29].…”
Section: Context-dependent: BERT (mentioning)
confidence: 99%