2016
DOI: 10.1007/978-3-319-49004-5_14

Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness Using Machine Translation

Abstract: This paper provides a comparative analysis of the performance of four state-of-the-art distributional semantic models (DSMs) over 11 languages, contrasting the native language-specific models with the use of machine translation over English-based DSMs. The experimental results show a significant improvement (16.7% on average in Spearman correlation) when using state-of-the-art machine translation approaches. The results also show that the benefit of using the most informative corpus outweighs …
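The evaluation protocol the abstract describes (scoring word pairs with a DSM and comparing the resulting ranking against human judgments via Spearman correlation, optionally routing non-English pairs through machine translation first) can be sketched as follows. This is a minimal illustration, not the paper's code: the `embed` and `translate` hooks are hypothetical stand-ins for a word-vector lookup and an MT system.

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate_relatedness(pairs, gold_scores, embed, translate=None):
    """Spearman correlation between model similarities and human judgments.

    pairs       -- (word1, word2) tuples in the evaluation language
    gold_scores -- human-annotated relatedness scores, same order as pairs
    embed       -- word -> vector lookup for the chosen DSM (assumed helper)
    translate   -- optional word-level MT hook (hypothetical); when given,
                   words are translated to English before scoring
    """
    model_scores = []
    for w1, w2 in pairs:
        if translate is not None:
            w1, w2 = translate(w1), translate(w2)
        model_scores.append(cosine(embed(w1), embed(w2)))
    rho, _ = spearmanr(model_scores, gold_scores)
    return rho
```

Under this setup, the paper's comparison amounts to running the same function twice, once with a native-language DSM and `translate=None`, and once with an English DSM plus the MT hook, then comparing the resulting correlations.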

Cited by 11 publications (7 citation statements). References 14 publications.
“…The constructed paradigmatic and syntagmatic embeddings successfully dissociate the two semantic principles, as shown by the word-pair semantic proximity ranking task (Table II), measured with paradigmatic-/syntagmatic-specific benchmarks. The French benchmark data are translated [44] from …”
Section: A Twofold Dissociation in Embeddings
confidence: 99%
“…The next embedding evaluation we consider is a word similarity task. The ground truth data consist of pairs of words and a human-annotated similarity score, averaged across all human evaluations, from a translation of the WS-353 dataset (Freitas et al., 2016). The scores are computed via the cosine similarity between the vector representations of each word in a pair.…”
Section: Word Similarity
confidence: 99%
“…The method sums the vectors of the words in multi-word expressions. To compute relatedness between vectors, we use the Indra implementation (Freitas et al., 2016) of the cosine similarity metric. The system computes cosine similarity for all possible pairwise combinations of tokens in each message.…”
Section: Computing Semantic Relatedness
confidence: 99%
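The composition strategy this last statement describes (summing word vectors to represent a multi-word expression, then scoring all pairwise token combinations by cosine similarity) can be sketched as below. This is a generic illustration rather than the Indra API itself; it assumes a plain word-to-vector dictionary.

```python
from itertools import combinations
import numpy as np

def mwe_vector(expression, vectors):
    """Represent a multi-word expression as the sum of its word vectors."""
    return np.sum([vectors[w] for w in expression.split()], axis=0)

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pairwise_relatedness(tokens, vectors):
    """Cosine relatedness for all pairwise combinations of tokens in a message."""
    return {(a, b): cosine(mwe_vector(a, vectors), mwe_vector(b, vectors))
            for a, b in combinations(tokens, 2)}
```

For example, `pairwise_relatedness(["machine translation", "semantic relatedness"], vectors)` would yield one score for that token pair, computed between the summed vectors of the two expressions.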