2021
DOI: 10.48550/arxiv.2109.06304
Preprint

Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration

Cited by 6 publications (7 citation statements) · References 28 publications
“…Importantly, the static-equivalent embeddings produced from contextualized embeddings can be utilized in the same ways as those from older Word2Vec or GloVe models, and also outperform them [9]. Subsequently, novel methods of creating static equivalents have been described, using continuous bag-of-words approaches [23], phrases [24], and combinations of contextual and static embeddings [25], for example. Nevertheless, this study has demonstrated … [A Proposed Knowledge Discovery Method Utilizing Contextual Word Embeddings] Based upon the results of this study, a working knowledge-discovery framework utilizing BERT can be achieved by:…”
Section: Discussion
confidence: 99%
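The "static-equivalent" idea quoted above can be made concrete: a word's contextual BERT vectors are averaged across many sentences to yield one static vector per word. Below is a minimal Python sketch of that pooling step; the model name ("bert-base-uncased"), the mean-pooling choice, and the helper `static_equivalent` are illustrative assumptions, not the exact procedure of [9] or [23].

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def static_equivalent(word: str, contexts: list[str]) -> torch.Tensor:
    """Average the contextual vectors of `word` over all `contexts`."""
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    vecs = []
    for sentence in contexts:
        enc = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden)
        ids = enc["input_ids"][0].tolist()
        # locate the subword span of `word` and mean-pool its hidden states
        for i in range(len(ids) - len(word_ids) + 1):
            if ids[i : i + len(word_ids)] == word_ids:
                vecs.append(hidden[i : i + len(word_ids)].mean(dim=0))
    if not vecs:
        raise ValueError(f"{word!r} not found in any context")
    return torch.stack(vecs).mean(dim=0)  # one static vector for the word

bank = static_equivalent("bank", [
    "She sat on the river bank.",
    "The bank raised interest rates again.",
])
```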
“…We leverage the prior knowledge brought by the BERT model and develop an inferential mapping filter. It first embeds input sequences of arbitrary length into vectors of a fixed length [26]. This embedding process can also be understood as mapping textual sequences into a numeric hyperspace.…”
Section: Inferential Mapping
confidence: 99%
“…This embedding process can also be understood as mapping textual sequences into a numeric hyperspace. Based on the studies in [26], this hyperspace is claimed to place phrases with similar inferential information closer together. E.g., the Euclidean distance between "at the gates" and "at the doors" after embedding is 8.805, whereas the distance between "at the gates" and "on the campus" is 14.95.…”
Section: Inferential Mapping
confidence: 99%
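The distance comparison in this quotation is straightforward to reproduce with the phrase encoder itself. A minimal sketch follows, assuming the publicly released Phrase-BERT checkpoint ("whaleloops/phrase-bert" on the Hugging Face Hub) loaded through sentence-transformers; the exact values will differ from the quoted 8.805 and 14.95 unless the same checkpoint and preprocessing are used, but the ordering (similar phrases closer) should hold.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed checkpoint name for the authors' released Phrase-BERT weights.
model = SentenceTransformer("whaleloops/phrase-bert")

phrases = ["at the gates", "at the doors", "on the campus"]
emb = model.encode(phrases)  # one fixed-length vector per phrase

d_similar = np.linalg.norm(emb[0] - emb[1])     # "at the gates" vs "at the doors"
d_dissimilar = np.linalg.norm(emb[0] - emb[2])  # "at the gates" vs "on the campus"
print(f"similar pair: {d_similar:.3f}, dissimilar pair: {d_dissimilar:.3f}")
# Expected ordering, per the quotation: d_similar < d_dissimilar.
```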
“…Phrase-BERT produces "phrase embeddings" that establish a vectorial, semantic "understanding" spanning the word, segment, and sentence levels [16]. BERTopic leverages term frequency-inverse document frequency (TF-IDF) and maximal marginal relevance algorithms to cluster semantically similar reports [17]. Figure 6 visually depicts topic modeling output using BERTopic.…”
Section: Option 1: Cluster Then Classify
confidence: 99%
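For readers who want to try the cluster-then-classify setup described in this quotation, a minimal BERTopic sketch follows. The toy corpus and the MMR diversity value are illustrative assumptions, not the cited paper's configuration; BERTopic's default UMAP/HDBSCAN stack also needs a reasonably large corpus to form stable clusters.

```python
from bertopic import BERTopic
from bertopic.representation import MaximalMarginalRelevance

# Toy stand-in corpus; real use would supply hundreds of free-text reports.
docs = (
    [f"chest x-ray {i} shows no acute cardiopulmonary findings" for i in range(100)]
    + [f"brain mri {i} demonstrates a small enhancing lesion" for i in range(100)]
)

# MMR re-ranks the class-based TF-IDF keywords to reduce redundancy in
# each topic's representation; 0.3 is an illustrative diversity value.
representation = MaximalMarginalRelevance(diversity=0.3)
topic_model = BERTopic(representation_model=representation)

topics, probs = topic_model.fit_transform(docs)  # one topic id per report
print(topic_model.get_topic_info())              # summary table of clusters
```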