Enhancing Subword Embeddings with Open N-grams (2020)
DOI: 10.1007/978-3-030-51310-8_1

Abstract: Using subword n-grams for training word embeddings makes it possible to subsequently compute vectors for rare and misspelled words. However, we argue that the subword vector qualities can be degraded for words which have a high orthographic neighbourhood; a property of words that has been extensively studied in the psycholinguistic literature. Empirical findings about lexical neighbourhood effects constrain models of human word encoding, which must also be consistent with what we know about neurophysiological …
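To make the mechanism in the abstract's first sentence concrete, below is a minimal sketch of fastText-style subword composition, assuming trained n-gram vectors are available. The names `char_ngrams`, `oov_vector`, and `ngram_vectors` are illustrative, not from the paper, and fastText itself hashes n-grams into a fixed number of buckets rather than keeping an explicit dictionary.

```python
import numpy as np

def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> list[str]:
    """Contiguous character n-grams of a word, with boundary markers."""
    w = f"<{word}>"  # mark word boundaries, as fastText does
    return [w[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def oov_vector(word: str, ngram_vectors: dict[str, np.ndarray], dim: int) -> np.ndarray:
    """Compose a vector for a rare or misspelled word from its n-gram vectors.

    `ngram_vectors` stands in for embeddings learned during training;
    any n-gram never seen in training is simply skipped.
    """
    grams = [g for g in char_ngrams(word) if g in ngram_vectors]
    if not grams:
        return np.zeros(dim)
    return np.mean([ngram_vectors[g] for g in grams], axis=0)
```

Because a misspelled form such as "embeddigns" still shares most of its n-grams with "embeddings", their composed vectors land close together, which is why this works for out-of-vocabulary forms. The abstract's caveat is that a word with many orthographic neighbours shares n-grams with many unrelated words, which can degrade the quality of its subword vector.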

Cited by 4 publications (1 citation statement).
References 17 publications.
“…Character n-grams can also be used in a task, regardless of the language. Character n-grams can be used with new languages, and they can also detect rare words that are out of vocabulary (OOV) or misspelled [24][25][26]. According to the findings, n-grams with n = 3 are the most common.…”
Section: Feature Selection
Confidence: 99%
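The citing work above uses contiguous character n-grams (most commonly trigrams), whereas the paper's titular open n-grams are ordered but possibly non-contiguous letter combinations, a coding scheme drawn from the psycholinguistic literature on word encoding. A minimal sketch of open n-gram extraction follows; the function name and the optional `max_gap` window are assumptions for illustration, and the paper's exact scheme may differ.

```python
from itertools import combinations

def open_ngrams(word: str, n: int = 2, max_gap: int | None = None) -> list[str]:
    """Ordered, possibly non-contiguous character n-grams ("open" n-grams).

    With max_gap=None every ordered combination of positions is used;
    a finite max_gap keeps the first and last position within a window.
    """
    grams = []
    for idx in combinations(range(len(word)), n):
        if max_gap is not None and idx[-1] - idx[0] > max_gap:
            continue
        grams.append("".join(word[i] for i in idx))
    return grams

print(open_ngrams("word"))
# ['wo', 'wr', 'wd', 'or', 'od', 'rd'] -- includes the non-contiguous
# pairs 'wr', 'wd', and 'od' that contiguous bigrams would miss
```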